instances using virtual router for DNS instead of DNS servers

2014-10-13 Thread Ian Young
When I set up CloudStack, I chose my physical DNS servers as both the
internal and external DNS servers.  They perform recursion, so they are
suitable for queries about hosts on the LAN as well as on the rest of the
internet.  However, the /etc/resolv.conf file in my instances lists the
virtual router first, followed by the physical servers I chose during
setup.  The virtual router does not successfully return answers about
internal hosts, causing the instances to be unable to reach each other.

I'm aware of the use.external.dns option but the last time I set that to
true and restarted the virtual router, it failed to start up again.  Why is
DHCP assigning the virtual router as the first name server instead of using
the ones I selected during setup?
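
One way to confirm the order in which an instance's resolver tries its name servers is to read the nameserver lines out of /etc/resolv.conf. The following is a minimal Python sketch, not part of CloudStack; the function name parse_nameservers is an illustrative assumption.

```python
def parse_nameservers(resolv_conf_text):
    """Return the nameserver addresses in the order they are listed."""
    servers = []
    for line in resolv_conf_text.splitlines():
        # Drop comments, which start with '#' or ';'.
        line = line.split("#", 1)[0].split(";", 1)[0].strip()
        fields = line.split()
        if len(fields) >= 2 and fields[0] == "nameserver":
            servers.append(fields[1])
    return servers
```

On an instance, passing the contents of /etc/resolv.conf to parse_nameservers shows which server is queried first. If the first entry is the virtual router's address rather than one of the physical DNS servers, that matches the behaviour described above.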


Re: services not running after reboot

2014-10-13 Thread Ian Young
I didn't find anything like that.  Everything's been running OK over the
weekend so I will leave it as is.

On Mon, Oct 13, 2014 at 2:18 AM, Daan Hoogland 
wrote:

> Good going Ian, sorry you didn't get any assistance on the way. Did you
> find a setting that should have a different default? Like the router
> service offering memory :P or doesn't that make any sense?
>
> On Sat, Oct 11, 2014 at 5:11 AM, Ian Young  wrote:
>
> > Aha!  I restarted cloudstack-agent, which caused the virtual router to
> > change to a "stopped" status in the management console.  However, the
> > console viewer icon was still visible, so I clicked it.  The router had
> > run out of memory and caused a kernel panic.  I created a new system
> > service offering with 500 MB of memory, changed the router's service
> > offering, and started it.  It booted with no problem.  The default memory
> > size of 128 MB is not enough.  This is the system VM template I was using:
> >
> > http://cloudstack.apt-get.eu/systemvm/4.4/systemvm64template-4.4.0-6-kvm.qcow2.bz2
> >
> > On Fri, Oct 10, 2014 at 7:28 PM, Ian Young wrote:
> >
> > > I dropped all the cloud* databases, deleted everything in primary and
> > > secondary storage, and reinstalled the management server, following the
> > > guide I wrote for myself the last time I built a stable CloudStack
> > > system.  Then I imported one of my backed up instances as a template and
> > > tried to create a new VM.  Same problem as before.  How is this possible?
> > >
> > > 2014-10-10 19:17:44,075 WARN  [kvm.resource.LibvirtComputingResource]
> > > (agentRequest-Handler-3:null) Timed out:
> > > /usr/share/cloudstack-common/scripts/vm/hypervisor/kvm/
> patchviasocket.pl
> > > -n r-4-VM -p
> > >
> >
> %template=domP%name=r-4-VM%eth0ip=192.168.102.222%eth0mask=255.255.255.0%gateway=192.168.102.1%domain=
> > > lax.ratespecial.com
> >
> %cidrsize=24%dhcprange=192.168.102.1%eth1ip=169.254.0.33%eth1mask=255.255.0.0%type=dhcpsrvr%disable_rp_filter=true%dns1=192.168.100.2%dns2=192.168.100.3
> > > .  Output is:
> > > 2014-10-10 19:18:05,078 WARN  [kvm.resource.LibvirtComputingResource]
> > > (Script-3:null) Interrupting script.
> > >
> > > On Fri, Oct 10, 2014 at 4:33 PM, Ian Young 
> > wrote:
> > >
> > >> I've restarted all the services and restarted the servers too.  The
> SSVM
> > >> and CP start with no trouble.  Every time I try to start or create an
> > >> instance, I see repeated messages like these:
> > >>
> > >> /var/log/cloudstack/agent/cloudstack-agent.out:
> > >> 2014-10-10 16:27:21,841{GMT} WARN
> > >>  [kvm.resource.LibvirtComputingResource] (Script-8:) Interrupting
> > script.
> > >> 2014-10-10 16:27:21,841{GMT} WARN
> > >>  [kvm.resource.LibvirtComputingResource] (agentRequest-Handler-4:)
> Timed
> > >> out: /usr/share/cloudstack-common/scripts/vm/hypervisor/kvm/
> > >> patchviasocket.pl -n r-19-VM -p
> > >>
> >
> %template=domP%name=r-19-VM%eth0ip=192.168.102.89%eth0mask=255.255.255.0%gateway=192.168.102.1%domain=
> > >> lax.ratespecial.com
> >
> %cidrsize=24%dhcprange=192.168.102.1%eth1ip=169.254.2.193%eth1mask=255.255.0.0%type=dhcpsrvr%disable_rp_filter=true%dns1=192.168.100.2%dns2=192.168.100.3
> > >> .  Output is:
> > >>
> > >> /var/log/cloudstack/agent/security_group.log:
> > >> 2014-10-10 16:27:33,259 - Failed to get rule logs, better luck next
> > time!
> > >>
> > >> On Fri, Oct 10, 2014 at 3:04 PM, Ian Young 
> > >> wrote:
> > >>
> > >>> I tried to restart the network with the "clean up" option, via the
> web
> > >>> console.  After several minutes, it failed to restart the network.
> The
> > >>> SSVM and CP are still running but the VR no longer exists.  Why would
> > these
> > >>> be able to start but not the virtual router?
> > >>>
> > >>> On Fri, Oct 10, 2014 at 2:48 PM, Ian Young 
> > >>> wrote:
> > >>>
> > >>>> I restarted the libvirtd service and the management service is now
> > >>>> fully started (there are services listening on ports 8250 and 9090).
> > The
> > >>>> SSVM health check script now reports no problems.
> > >>>>
> > >>>> However, I tried starting an instance and both t

Re: services not running after reboot

2014-10-10 Thread Ian Young
Aha!  I restarted cloudstack-agent, which caused the virtual router to
change to a "stopped" status in the management console.  However, the
console viewer icon was still visible, so I clicked it.  The router had run
out of memory and caused a kernel panic.  I created a new system service
offering with 500 MB of memory, changed the router's service offering, and
started it.  It booted with no problem.  The default memory size of 128 MB
is not enough.  This is the system VM template I was using:

http://cloudstack.apt-get.eu/systemvm/4.4/systemvm64template-4.4.0-6-kvm.qcow2.bz2

On Fri, Oct 10, 2014 at 7:28 PM, Ian Young  wrote:

> I dropped all the cloud* databases, deleted everything in primary and
> secondary storage, and reinstalled the management server, following the
> guide I wrote for myself the last time I built a stable CloudStack system.
> Then I imported one of my backed up instances as a template and tried to
> create a new VM.  Same problem as before.  How is this possible?
>
> 2014-10-10 19:17:44,075 WARN  [kvm.resource.LibvirtComputingResource]
> (agentRequest-Handler-3:null) Timed out:
> /usr/share/cloudstack-common/scripts/vm/hypervisor/kvm/patchviasocket.pl
> -n r-4-VM -p
> %template=domP%name=r-4-VM%eth0ip=192.168.102.222%eth0mask=255.255.255.0%gateway=192.168.102.1%domain=
> lax.ratespecial.com%cidrsize=24%dhcprange=192.168.102.1%eth1ip=169.254.0.33%eth1mask=255.255.0.0%type=dhcpsrvr%disable_rp_filter=true%dns1=192.168.100.2%dns2=192.168.100.3
> .  Output is:
> 2014-10-10 19:18:05,078 WARN  [kvm.resource.LibvirtComputingResource]
> (Script-3:null) Interrupting script.
>
> On Fri, Oct 10, 2014 at 4:33 PM, Ian Young  wrote:
>
>> I've restarted all the services and restarted the servers too.  The SSVM
>> and CP start with no trouble.  Every time I try to start or create an
>> instance, I see repeated messages like these:
>>
>> /var/log/cloudstack/agent/cloudstack-agent.out:
>> 2014-10-10 16:27:21,841{GMT} WARN
>>  [kvm.resource.LibvirtComputingResource] (Script-8:) Interrupting script.
>> 2014-10-10 16:27:21,841{GMT} WARN
>>  [kvm.resource.LibvirtComputingResource] (agentRequest-Handler-4:) Timed
>> out: /usr/share/cloudstack-common/scripts/vm/hypervisor/kvm/
>> patchviasocket.pl -n r-19-VM -p
>> %template=domP%name=r-19-VM%eth0ip=192.168.102.89%eth0mask=255.255.255.0%gateway=192.168.102.1%domain=
>> lax.ratespecial.com%cidrsize=24%dhcprange=192.168.102.1%eth1ip=169.254.2.193%eth1mask=255.255.0.0%type=dhcpsrvr%disable_rp_filter=true%dns1=192.168.100.2%dns2=192.168.100.3
>> .  Output is:
>>
>> /var/log/cloudstack/agent/security_group.log:
>> 2014-10-10 16:27:33,259 - Failed to get rule logs, better luck next time!
>>
>> On Fri, Oct 10, 2014 at 3:04 PM, Ian Young 
>> wrote:
>>
>>> I tried to restart the network with the "clean up" option, via the web
>>> console.  After several minutes, it failed to restart the network.  The
>>> SSVM and CP are still running but the VR no longer exists.  Why would these
>>> be able to start but not the virtual router?
>>>
>>> On Fri, Oct 10, 2014 at 2:48 PM, Ian Young 
>>> wrote:
>>>
>>>> I restarted the libvirtd service and the management service is now
>>>> fully started (there are services listening on ports 8250 and 9090).  The
>>>> SSVM health check script now reports no problems.
>>>>
>>>> However, I tried starting an instance and both the instance and the
>>>> virtual router are in a "starting" state but have been so for almost 10
>>>> minutes.  In the catalina.out log I see:
>>>> INFO  [c.c.v.VirtualMachineManagerImpl] (AgentManager-Handler-10:null)
>>>> There is pending job or HA tasks working on the VM. vm id: 4, postpone
>>>> power-change report by resetting power-change counters
>>>> INFO  [c.c.v.VirtualMachineManagerImpl] (AgentManager-Handler-10:null)
>>>> There is pending job or HA tasks working on the VM. vm id: 13, postpone
>>>> power-change report by resetting power-change counters
>>>>
>>>> I'm also seeing this in the agent.log:
>>>> 2014-10-10 14:43:26,833 WARN  [kvm.resource.LibvirtComputingResource]
>>>> (Script-6:null) Interrupting script.
>>>> 2014-10-10 14:43:26,833 WARN  [kvm.resource.LibvirtComputingResource]
>>>> (agentRequest-Handler-2:null) Timed out:
>>>> /usr/share/cloudstack-common/scripts/vm/hypervisor/kvm/
>>>> patchviasocket.pl -n r-4-VM -p
>>>> %template=domP%name=r-4-VM%eth0ip=192.168.102.110%eth0mask=255.255.255.0%gateway=192.168.102.1%domain=
>>>> lax.ra

Re: services not running after reboot

2014-10-10 Thread Ian Young
I dropped all the cloud* databases, deleted everything in primary and
secondary storage, and reinstalled the management server, following the
guide I wrote for myself the last time I built a stable CloudStack system.
Then I imported one of my backed up instances as a template and tried to
create a new VM.  Same problem as before.  How is this possible?

2014-10-10 19:17:44,075 WARN  [kvm.resource.LibvirtComputingResource]
(agentRequest-Handler-3:null) Timed out:
/usr/share/cloudstack-common/scripts/vm/hypervisor/kvm/patchviasocket.pl -n
r-4-VM -p
%template=domP%name=r-4-VM%eth0ip=192.168.102.222%eth0mask=255.255.255.0%gateway=192.168.102.1%domain=
lax.ratespecial.com%cidrsize=24%dhcprange=192.168.102.1%eth1ip=169.254.0.33%eth1mask=255.255.0.0%type=dhcpsrvr%disable_rp_filter=true%dns1=192.168.100.2%dns2=192.168.100.3
.  Output is:
2014-10-10 19:18:05,078 WARN  [kvm.resource.LibvirtComputingResource]
(Script-3:null) Interrupting script.
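
The argument that follows -p in the "Timed out" warning appears to be a list of %key=value pairs. Splitting it into a dictionary makes it easier to check which DNS servers, gateway and router type were passed to the virtual router. This is a minimal Python sketch; the format is inferred only from the log line above, and the function name parse_boot_args is an illustrative assumption.

```python
def parse_boot_args(raw):
    """Split a '%key=value%key=value' string into a dict."""
    # The log line was wrapped by the mail client, so drop any whitespace
    # that was introduced inside the string.
    joined = "".join(raw.split())
    args = {}
    for item in joined.split("%"):
        if not item:
            continue  # skip the empty piece before the leading '%'
        key, _, value = item.partition("=")
        args[key] = value
    return args
```

Applied to the string logged above (without the trailing ".  Output is:"), the dns1 and dns2 entries show the DNS servers that were handed to the router at boot.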

On Fri, Oct 10, 2014 at 4:33 PM, Ian Young  wrote:

> I've restarted all the services and restarted the servers too.  The SSVM
> and CP start with no trouble.  Every time I try to start or create an
> instance, I see repeated messages like these:
>
> /var/log/cloudstack/agent/cloudstack-agent.out:
> 2014-10-10 16:27:21,841{GMT} WARN  [kvm.resource.LibvirtComputingResource]
> (Script-8:) Interrupting script.
> 2014-10-10 16:27:21,841{GMT} WARN  [kvm.resource.LibvirtComputingResource]
> (agentRequest-Handler-4:) Timed out:
> /usr/share/cloudstack-common/scripts/vm/hypervisor/kvm/patchviasocket.pl
> -n r-19-VM -p
> %template=domP%name=r-19-VM%eth0ip=192.168.102.89%eth0mask=255.255.255.0%gateway=192.168.102.1%domain=
> lax.ratespecial.com%cidrsize=24%dhcprange=192.168.102.1%eth1ip=169.254.2.193%eth1mask=255.255.0.0%type=dhcpsrvr%disable_rp_filter=true%dns1=192.168.100.2%dns2=192.168.100.3
> .  Output is:
>
> /var/log/cloudstack/agent/security_group.log:
> 2014-10-10 16:27:33,259 - Failed to get rule logs, better luck next time!
>
> On Fri, Oct 10, 2014 at 3:04 PM, Ian Young  wrote:
>
>> I tried to restart the network with the "clean up" option, via the web
>> console.  After several minutes, it failed to restart the network.  The
>> SSVM and CP are still running but the VR no longer exists.  Why would these
>> be able to start but not the virtual router?
>>
>> On Fri, Oct 10, 2014 at 2:48 PM, Ian Young 
>> wrote:
>>
>>> I restarted the libvirtd service and the management service is now fully
>>> started (there are services listening on ports 8250 and 9090).  The SSVM
>>> health check script now reports no problems.
>>>
>>> However, I tried starting an instance and both the instance and the
>>> virtual router are in a "starting" state but have been so for almost 10
>>> minutes.  In the catalina.out log I see:
>>> INFO  [c.c.v.VirtualMachineManagerImpl] (AgentManager-Handler-10:null)
>>> There is pending job or HA tasks working on the VM. vm id: 4, postpone
>>> power-change report by resetting power-change counters
>>> INFO  [c.c.v.VirtualMachineManagerImpl] (AgentManager-Handler-10:null)
>>> There is pending job or HA tasks working on the VM. vm id: 13, postpone
>>> power-change report by resetting power-change counters
>>>
>>> I'm also seeing this in the agent.log:
>>> 2014-10-10 14:43:26,833 WARN  [kvm.resource.LibvirtComputingResource]
>>> (Script-6:null) Interrupting script.
>>> 2014-10-10 14:43:26,833 WARN  [kvm.resource.LibvirtComputingResource]
>>> (agentRequest-Handler-2:null) Timed out:
>>> /usr/share/cloudstack-common/scripts/vm/hypervisor/kvm/patchviasocket.pl
>>> -n r-4-VM -p
>>> %template=domP%name=r-4-VM%eth0ip=192.168.102.110%eth0mask=255.255.255.0%gateway=192.168.102.1%domain=
>>> lax.ratespecial.com%cidrsize=24%dhcprange=192.168.102.1%eth1ip=169.254.2.181%eth1mask=255.255.0.0%type=dhcpsrvr%disable_rp_filter=true%dns1=192.168.100.2%dns2=192.168.100.3
>>> .  Output is:
>>>
>>> And in the security_group.log:
>>> 2014-10-10 14:42:41,926 - Failed to get rule logs, better luck next time!
>>> 2014-10-10 14:43:41,926 - Failed to get rule logs, better luck next time!
>>>
>>> What does this mean?
>>>
>>> On Fri, Oct 10, 2014 at 2:11 PM, Ian Young 
>>> wrote:
>>>
>>>> This morning I was unable to start new instances.  I discovered that I
>>>> could SSH into the SSVM and the console proxy but not the virtual router.
>>>> Something strange was happening so I thought it might be a good time to
>>>> gracefully stop all the instances and reboot the hypervisor to see if the
>>>> VR wo

Re: services not running after reboot

2014-10-10 Thread Ian Young
I've restarted all the services and restarted the servers too.  The SSVM
and CP start with no trouble.  Every time I try to start or create an
instance, I see repeated messages like these:

/var/log/cloudstack/agent/cloudstack-agent.out:
2014-10-10 16:27:21,841{GMT} WARN  [kvm.resource.LibvirtComputingResource]
(Script-8:) Interrupting script.
2014-10-10 16:27:21,841{GMT} WARN  [kvm.resource.LibvirtComputingResource]
(agentRequest-Handler-4:) Timed out:
/usr/share/cloudstack-common/scripts/vm/hypervisor/kvm/patchviasocket.pl -n
r-19-VM -p
%template=domP%name=r-19-VM%eth0ip=192.168.102.89%eth0mask=255.255.255.0%gateway=192.168.102.1%domain=
lax.ratespecial.com%cidrsize=24%dhcprange=192.168.102.1%eth1ip=169.254.2.193%eth1mask=255.255.0.0%type=dhcpsrvr%disable_rp_filter=true%dns1=192.168.100.2%dns2=192.168.100.3
.  Output is:

/var/log/cloudstack/agent/security_group.log:
2014-10-10 16:27:33,259 - Failed to get rule logs, better luck next time!

On Fri, Oct 10, 2014 at 3:04 PM, Ian Young  wrote:

> I tried to restart the network with the "clean up" option, via the web
> console.  After several minutes, it failed to restart the network.  The
> SSVM and CP are still running but the VR no longer exists.  Why would these
> be able to start but not the virtual router?
>
> On Fri, Oct 10, 2014 at 2:48 PM, Ian Young  wrote:
>
>> I restarted the libvirtd service and the management service is now fully
>> started (there are services listening on ports 8250 and 9090).  The SSVM
>> health check script now reports no problems.
>>
>> However, I tried starting an instance and both the instance and the
>> virtual router are in a "starting" state but have been so for almost 10
>> minutes.  In the catalina.out log I see:
>> INFO  [c.c.v.VirtualMachineManagerImpl] (AgentManager-Handler-10:null)
>> There is pending job or HA tasks working on the VM. vm id: 4, postpone
>> power-change report by resetting power-change counters
>> INFO  [c.c.v.VirtualMachineManagerImpl] (AgentManager-Handler-10:null)
>> There is pending job or HA tasks working on the VM. vm id: 13, postpone
>> power-change report by resetting power-change counters
>>
>> I'm also seeing this in the agent.log:
>> 2014-10-10 14:43:26,833 WARN  [kvm.resource.LibvirtComputingResource]
>> (Script-6:null) Interrupting script.
>> 2014-10-10 14:43:26,833 WARN  [kvm.resource.LibvirtComputingResource]
>> (agentRequest-Handler-2:null) Timed out:
>> /usr/share/cloudstack-common/scripts/vm/hypervisor/kvm/patchviasocket.pl
>> -n r-4-VM -p
>> %template=domP%name=r-4-VM%eth0ip=192.168.102.110%eth0mask=255.255.255.0%gateway=192.168.102.1%domain=
>> lax.ratespecial.com%cidrsize=24%dhcprange=192.168.102.1%eth1ip=169.254.2.181%eth1mask=255.255.0.0%type=dhcpsrvr%disable_rp_filter=true%dns1=192.168.100.2%dns2=192.168.100.3
>> .  Output is:
>>
>> And in the security_group.log:
>> 2014-10-10 14:42:41,926 - Failed to get rule logs, better luck next time!
>> 2014-10-10 14:43:41,926 - Failed to get rule logs, better luck next time!
>>
>> What does this mean?
>>
>> On Fri, Oct 10, 2014 at 2:11 PM, Ian Young 
>> wrote:
>>
>>> This morning I was unable to start new instances.  I discovered that I
>>> could SSH into the SSVM and the console proxy but not the virtual router.
>>> Something strange was happening so I thought it might be a good time to
>>> gracefully stop all the instances and reboot the hypervisor to see if the
>>> VR would start working again.  I also rebooted the management server (a
>>> separate machine) to have a clean slate.  Now that they've both been
>>> rebooted, the following symptoms exist:
>>>
>>> * On the management server, there are no services listening on 9090 or
>>> 8250.
>>> * When I run the SSVM health check script, it says NFS is not currently
>>> mounted.
>>> * The management server log is reporting that Zone 1 is not ready to
>>> launch SSVM/CP yet, even though both of those are running.
>>>
>>> The NFS server is running just fine.  I can mount it in the management
>>> server with no problems.  I've restarted cloudstack-management and
>>> cloudstack-agent but the problems persist.  The "not ready to launch
>>> SSVM/CP yet" message sounds like the management server and the hypervisor
>>> are not communicating or some information about the system state is out of
>>> sync.  How can I confirm this?
>>>
>>
>>
>


Re: services not running after reboot

2014-10-10 Thread Ian Young
I tried to restart the network with the "clean up" option, via the web
console.  After several minutes, it failed to restart the network.  The
SSVM and CP are still running but the VR no longer exists.  Why would these
be able to start but not the virtual router?

On Fri, Oct 10, 2014 at 2:48 PM, Ian Young  wrote:

> I restarted the libvirtd service and the management service is now fully
> started (there are services listening on ports 8250 and 9090).  The SSVM
> health check script now reports no problems.
>
> However, I tried starting an instance and both the instance and the
> virtual router are in a "starting" state but have been so for almost 10
> minutes.  In the catalina.out log I see:
> INFO  [c.c.v.VirtualMachineManagerImpl] (AgentManager-Handler-10:null)
> There is pending job or HA tasks working on the VM. vm id: 4, postpone
> power-change report by resetting power-change counters
> INFO  [c.c.v.VirtualMachineManagerImpl] (AgentManager-Handler-10:null)
> There is pending job or HA tasks working on the VM. vm id: 13, postpone
> power-change report by resetting power-change counters
>
> I'm also seeing this in the agent.log:
> 2014-10-10 14:43:26,833 WARN  [kvm.resource.LibvirtComputingResource]
> (Script-6:null) Interrupting script.
> 2014-10-10 14:43:26,833 WARN  [kvm.resource.LibvirtComputingResource]
> (agentRequest-Handler-2:null) Timed out:
> /usr/share/cloudstack-common/scripts/vm/hypervisor/kvm/patchviasocket.pl
> -n r-4-VM -p
> %template=domP%name=r-4-VM%eth0ip=192.168.102.110%eth0mask=255.255.255.0%gateway=192.168.102.1%domain=
> lax.ratespecial.com%cidrsize=24%dhcprange=192.168.102.1%eth1ip=169.254.2.181%eth1mask=255.255.0.0%type=dhcpsrvr%disable_rp_filter=true%dns1=192.168.100.2%dns2=192.168.100.3
> .  Output is:
>
> And in the security_group.log:
> 2014-10-10 14:42:41,926 - Failed to get rule logs, better luck next time!
> 2014-10-10 14:43:41,926 - Failed to get rule logs, better luck next time!
>
> What does this mean?
>
> On Fri, Oct 10, 2014 at 2:11 PM, Ian Young  wrote:
>
>> This morning I was unable to start new instances.  I discovered that I
>> could SSH into the SSVM and the console proxy but not the virtual router.
>> Something strange was happening so I thought it might be a good time to
>> gracefully stop all the instances and reboot the hypervisor to see if the
>> VR would start working again.  I also rebooted the management server (a
>> separate machine) to have a clean slate.  Now that they've both been
>> rebooted, the following symptoms exist:
>>
>> * On the management server, there are no services listening on 9090 or
>> 8250.
>> * When I run the SSVM health check script, it says NFS is not currently
>> mounted.
>> * The management server log is reporting that Zone 1 is not ready to
>> launch SSVM/CP yet, even though both of those are running.
>>
>> The NFS server is running just fine.  I can mount it in the management
>> server with no problems.  I've restarted cloudstack-management and
>> cloudstack-agent but the problems persist.  The "not ready to launch
>> SSVM/CP yet" message sounds like the management server and the hypervisor
>> are not communicating or some information about the system state is out of
>> sync.  How can I confirm this?
>>
>
>


Re: services not running after reboot

2014-10-10 Thread Ian Young
I restarted the libvirtd service and the management service is now fully
started (there are services listening on ports 8250 and 9090).  The SSVM
health check script now reports no problems.

However, I tried starting an instance and both the instance and the virtual
router are in a "starting" state but have been so for almost 10 minutes.
In the catalina.out log I see:
INFO  [c.c.v.VirtualMachineManagerImpl] (AgentManager-Handler-10:null)
There is pending job or HA tasks working on the VM. vm id: 4, postpone
power-change report by resetting power-change counters
INFO  [c.c.v.VirtualMachineManagerImpl] (AgentManager-Handler-10:null)
There is pending job or HA tasks working on the VM. vm id: 13, postpone
power-change report by resetting power-change counters

I'm also seeing this in the agent.log:
2014-10-10 14:43:26,833 WARN  [kvm.resource.LibvirtComputingResource]
(Script-6:null) Interrupting script.
2014-10-10 14:43:26,833 WARN  [kvm.resource.LibvirtComputingResource]
(agentRequest-Handler-2:null) Timed out:
/usr/share/cloudstack-common/scripts/vm/hypervisor/kvm/patchviasocket.pl -n
r-4-VM -p
%template=domP%name=r-4-VM%eth0ip=192.168.102.110%eth0mask=255.255.255.0%gateway=192.168.102.1%domain=
lax.ratespecial.com%cidrsize=24%dhcprange=192.168.102.1%eth1ip=169.254.2.181%eth1mask=255.255.0.0%type=dhcpsrvr%disable_rp_filter=true%dns1=192.168.100.2%dns2=192.168.100.3
.  Output is:

And in the security_group.log:
2014-10-10 14:42:41,926 - Failed to get rule logs, better luck next time!
2014-10-10 14:43:41,926 - Failed to get rule logs, better luck next time!

What does this mean?

On Fri, Oct 10, 2014 at 2:11 PM, Ian Young  wrote:

> This morning I was unable to start new instances.  I discovered that I
> could SSH into the SSVM and the console proxy but not the virtual router.
> Something strange was happening so I thought it might be a good time to
> gracefully stop all the instances and reboot the hypervisor to see if the
> VR would start working again.  I also rebooted the management server (a
> separate machine) to have a clean slate.  Now that they've both been
> rebooted, the following symptoms exist:
>
> * On the management server, there are no services listening on 9090 or 8250.
> * When I run the SSVM health check script, it says NFS is not currently
> mounted.
> * The management server log is reporting that Zone 1 is not ready to
> launch SSVM/CP yet, even though both of those are running.
>
> The NFS server is running just fine.  I can mount it in the management
> server with no problems.  I've restarted cloudstack-management and
> cloudstack-agent but the problems persist.  The "not ready to launch
> SSVM/CP yet" message sounds like the management server and the hypervisor
> are not communicating or some information about the system state is out of
> sync.  How can I confirm this?
>


services not running after reboot

2014-10-10 Thread Ian Young
This morning I was unable to start new instances.  I discovered that I
could SSH into the SSVM and the console proxy but not the virtual router.
Something strange was happening so I thought it might be a good time to
gracefully stop all the instances and reboot the hypervisor to see if the
VR would start working again.  I also rebooted the management server (a
separate machine) to have a clean slate.  Now that they've both been
rebooted, the following symptoms exist:

* On the management server, there are no services listening on 9090 or 8250.
* When I run the SSVM health check script, it says NFS is not currently
mounted.
* The management server log is reporting that Zone 1 is not ready to launch
SSVM/CP yet, even though both of those are running.

The NFS server is running just fine.  I can mount it in the management
server with no problems.  I've restarted cloudstack-management and
cloudstack-agent but the problems persist.  The "not ready to launch
SSVM/CP yet" message sounds like the management server and the hypervisor
are not communicating or some information about the system state is out of
sync.  How can I confirm this?
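
A quick way to check the first symptom from another machine is to attempt a TCP connection to each port. This is a minimal Python sketch using only the standard library; the function names port_is_open and check_ports are illustrative assumptions, and the default ports are the two mentioned above.

```python
import socket


def port_is_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False


def check_ports(host, ports=(8250, 9090)):
    """Map each port to whether something accepts connections on it."""
    return {port: port_is_open(host, port) for port in ports}
```

Calling check_ports with the management server's address returns a dictionary such as {8250: False, 9090: False} when nothing is listening. A False result only shows that the connection failed; it does not distinguish a stopped service from a firewall rule.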


Re: unable to start virtual router

2014-10-09 Thread Ian Young
I restarted the agent about a half dozen times and the router magically
started itself.  What's the best way to make my instances use our internal
DNS servers?

On Thu, Oct 9, 2014 at 3:33 PM, Ian Young  wrote:

> I wanted to bypass the virtual router as the first DNS server so that my
> instances would use our existing physical DNS servers.  I followed the
> instructions here:
>
> http://support.citrix.com/article/CTX138970
>
> I set "use.external.dns" to true and then restarted the virtual router.
> The VR remained in a "starting" state indefinitely.  Eventually it timed
> out.  Now I can't start the VR.  The management log says:
>
> 2014-10-09 14:00:13,275 ERROR [c.c.v.VirtualMachineManagerImpl]
> (Work-Job-Executor-1:ctx-74cbcbcf job-94/job-95 ctx-ead24324) Failed to
> start instance VM[DomainRouter|r-4-VM]
> 2014-10-09 14:00:13,636 ERROR [c.c.v.VmWorkJobHandlerProxy]
> (Work-Job-Executor-1:ctx-74cbcbcf job-94/job-95 ctx-ead24324) Invocation
> exception, caused by: com.cloud.exception.AgentUnavailableException:
> Resource [Host:1] is unreachable: Host 1: Unable to start instance due to
> Unable to start VM[DomainRouter|r-4-VM] due to error in finalizeStart, not
> retrying
> 2014-10-09 14:00:13,636 ERROR [c.c.v.VmWorkJobDispatcher]
> (Work-Job-Executor-1:ctx-74cbcbcf job-94/job-95) Unable to complete
> AsyncJobVO {id:95, userId: 2, accountId: 2, instanceType: null, instanceId:
> null, cmd: com.cloud.vm.VmWorkStart, cmdInfo:
> rO0ABXNyABhjb20uY2xvdWQudm0uVm1Xb3JrU3RhcnR9cMGsvxz73gIAC0oABGRjSWRMAAZhdm9pZHN0ADBMY29tL2Nsb3VkL2RlcGxveS9EZXBsb3ltZW50UGxhbm5lciRFeGNsdWRlTGlzdDtMAAljbHVzdGVySWR0ABBMamF2YS9sYW5nL0xvbmc7TAAGaG9zdElkcQB-AAJMAAtqb3VybmFsTmFtZXQAEkxqYXZhL2xhbmcvU3RyaW5nO0wAEXBoeXNpY2FsTmV0d29ya0lkcQB-AAJMAAdwbGFubmVycQB-AANMAAVwb2RJZHEAfgACTAAGcG9vbElkcQB-AAJMAAlyYXdQYXJhbXN0AA9MamF2YS91dGlsL01hcDtMAA1yZXNlcnZhdGlvbklkcQB-AAN4cgATY29tLmNsb3VkLnZtLlZtV29ya5-ZtlbwJWdrAgAESgAJYWNjb3VudElkSgAGdXNlcklkSgAEdm1JZEwAC2hhbmRsZXJOYW1lcQB-AAN4cAACAAIABHQAGVZpcnR1YWxNYWNoaW5lTWFuYWdlckltcGwAAHBwcHBwcHBwc3IAEWphdmEudXRpbC5IYXNoTWFwBQfawcMWYNEDAAJGAApsb2FkRmFjdG9ySQAJdGhyZXNob2xkeHA_QAAADHcIEAF0AA5SZXN0YXJ0TmV0d29ya3QAP3JPMEFCWE55QUJGcVlYWmhMbXhoYm1jdVFtOXZiR1ZoYnMwZ2NvRFZuUHJ1QWdBQldnQUZkbUZzZFdWNGNBRXhw,
> cmdVersion: 0, status: IN_PROGRESS, processStatus: 0, resultCode: 0,
> result: null, initMsid: 132241037805012, completeMsid: null, lastUpdated:
> null, lastPolled: null, created: Thu Oct 09 13:39:51 PDT 2014}, job
> origin:94
> 2014-10-09 14:00:13,763 ERROR [c.c.a.ApiAsyncJobDispatcher]
> (API-Job-Executor-1:ctx-e206f887 job-94) Unexpected exception while
> executing org.apache.cloudstack.api.command.admin.router.StartRouterCmd
>


unable to start virtual router

2014-10-09 Thread Ian Young
I wanted to bypass the virtual router as the first DNS server so that my
instances would use our existing physical DNS servers.  I followed the
instructions here:

http://support.citrix.com/article/CTX138970

I set "use.external.dns" to true and then restarted the virtual router.
The VR remained in a "starting" state indefinitely.  Eventually it timed
out.  Now I can't start the VR.  The management log says:

2014-10-09 14:00:13,275 ERROR [c.c.v.VirtualMachineManagerImpl]
(Work-Job-Executor-1:ctx-74cbcbcf job-94/job-95 ctx-ead24324) Failed to
start instance VM[DomainRouter|r-4-VM]
2014-10-09 14:00:13,636 ERROR [c.c.v.VmWorkJobHandlerProxy]
(Work-Job-Executor-1:ctx-74cbcbcf job-94/job-95 ctx-ead24324) Invocation
exception, caused by: com.cloud.exception.AgentUnavailableException:
Resource [Host:1] is unreachable: Host 1: Unable to start instance due to
Unable to start VM[DomainRouter|r-4-VM] due to error in finalizeStart, not
retrying
2014-10-09 14:00:13,636 ERROR [c.c.v.VmWorkJobDispatcher]
(Work-Job-Executor-1:ctx-74cbcbcf job-94/job-95) Unable to complete
AsyncJobVO {id:95, userId: 2, accountId: 2, instanceType: null, instanceId:
null, cmd: com.cloud.vm.VmWorkStart, cmdInfo:
rO0ABXNyABhjb20uY2xvdWQudm0uVm1Xb3JrU3RhcnR9cMGsvxz73gIAC0oABGRjSWRMAAZhdm9pZHN0ADBMY29tL2Nsb3VkL2RlcGxveS9EZXBsb3ltZW50UGxhbm5lciRFeGNsdWRlTGlzdDtMAAljbHVzdGVySWR0ABBMamF2YS9sYW5nL0xvbmc7TAAGaG9zdElkcQB-AAJMAAtqb3VybmFsTmFtZXQAEkxqYXZhL2xhbmcvU3RyaW5nO0wAEXBoeXNpY2FsTmV0d29ya0lkcQB-AAJMAAdwbGFubmVycQB-AANMAAVwb2RJZHEAfgACTAAGcG9vbElkcQB-AAJMAAlyYXdQYXJhbXN0AA9MamF2YS91dGlsL01hcDtMAA1yZXNlcnZhdGlvbklkcQB-AAN4cgATY29tLmNsb3VkLnZtLlZtV29ya5-ZtlbwJWdrAgAESgAJYWNjb3VudElkSgAGdXNlcklkSgAEdm1JZEwAC2hhbmRsZXJOYW1lcQB-AAN4cAACAAIABHQAGVZpcnR1YWxNYWNoaW5lTWFuYWdlckltcGwAAHBwcHBwcHBwc3IAEWphdmEudXRpbC5IYXNoTWFwBQfawcMWYNEDAAJGAApsb2FkRmFjdG9ySQAJdGhyZXNob2xkeHA_QAAADHcIEAF0AA5SZXN0YXJ0TmV0d29ya3QAP3JPMEFCWE55QUJGcVlYWmhMbXhoYm1jdVFtOXZiR1ZoYnMwZ2NvRFZuUHJ1QWdBQldnQUZkbUZzZFdWNGNBRXhw,
cmdVersion: 0, status: IN_PROGRESS, processStatus: 0, resultCode: 0,
result: null, initMsid: 132241037805012, completeMsid: null, lastUpdated:
null, lastPolled: null, created: Thu Oct 09 13:39:51 PDT 2014}, job
origin:94
2014-10-09 14:00:13,763 ERROR [c.c.a.ApiAsyncJobDispatcher]
(API-Job-Executor-1:ctx-e206f887 job-94) Unexpected exception while
executing org.apache.cloudstack.api.command.admin.router.StartRouterCmd
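
The cmdInfo value in the AsyncJobVO line begins with rO0AB, which is how the Java serialization header (bytes AC ED 00 05) looks in base64, and it uses the URL-safe alphabet ("-" and "_"). Decoding it and listing the printable runs gives a rough view of which classes and fields the job carried. This is a minimal Python sketch; it does not deserialize the object, and the function names are illustrative assumptions.

```python
import base64
import re

JAVA_SERIALIZATION_MAGIC = b"\xac\xed\x00\x05"


def decode_cmd_info(cmd_info):
    """Decode a URL-safe base64 cmdInfo value into raw bytes."""
    # Remove whitespace from mail wrapping and the trailing field separator.
    text = "".join(cmd_info.split()).rstrip(",")
    # Restore any '=' padding that was omitted.
    text += "=" * (-len(text) % 4)
    return base64.urlsafe_b64decode(text)


def printable_strings(data, min_length=4):
    """Return runs of printable ASCII at least min_length bytes long."""
    pattern = rb"[\x20-\x7e]{" + str(min_length).encode("ascii") + rb",}"
    return [match.decode("ascii") for match in re.findall(pattern, data)]
```

For the value logged above, the decoded bytes start with JAVA_SERIALIZATION_MAGIC, and the first printable run contains the class name com.cloud.vm.VmWorkStart, which matches the cmd field in the same log line.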


Re: upgraded to 4.4.1, management console is broken

2014-10-03 Thread Ian Young
Since the database had allegedly not been upgraded, I downgraded to 4.4.0.
The management console is now available, but I can't start instances.
Thinking I would have to back up all the instances and reinstall CloudStack
from scratch again, I converted one of the instances to a template but was
unable to download it.  The browser times out while trying to connect to
the SSVM.  I can't connect to the SSVM using the RSA key either, although
the console proxy shows it up and running, with the login prompt.  I port
scanned the SSVM and the only ports open were 80 and 443, not 22.

On Fri, Oct 3, 2014 at 1:51 PM, Ian Young  wrote:

> It looks like this is the root of the problem:
>
> 2014-10-03 13:44:47,131 DEBUG [c.c.u.d.Upgrade440to441] (main:null)
> Updating System Vm template IDs
> 2014-10-03 13:44:47,136 DEBUG [c.c.u.d.Upgrade440to441] (main:null)
> Updating LXC System Vms
> 2014-10-03 13:44:47,137 WARN  [c.c.u.d.Upgrade440to441] (main:null) 4.4.0
> LXC SystemVm template not found. LXC hypervisor is not used, so not failing
> upgrade
> 2014-10-03 13:44:47,138 DEBUG [c.c.u.d.Upgrade440to441] (main:null)
> Updating KVM System Vms
> 2014-10-03 13:44:47,141 ERROR [c.c.u.DatabaseUpgradeChecker] (main:null)
> Unable to upgrade the database
> com.cloud.utils.exception.CloudRuntimeException: 4.4.0 KVM SystemVm
> template not found. Cannot upgrade system Vms
>
> Any ideas about how I can fix this?  I had a 4.4.0 KVM SystemVm template
> prior to the upgrade.
>
> On Fri, Oct 3, 2014 at 1:39 PM, Ian Young  wrote:
>
>> I tried upgrading from 4.4.0 to 4.4.1 but now I get a 404 error when I
>> try to access the management console.  The localhost.2014-10-03.log shows
>> this error:
>>
>> Caused by: java.io.IOException: Resource
>> [jar:file:/usr/share/cloudstack-management/webapps/client/WEB-INF/lib/cloud-api-4.4.1.jar!/META-INF/cloudstack/api-config/module.properties]
>> and
>> [jar:file:/usr/share/cloudstack-management/webapps/client/WEB-INF/lib/cloud-api-4.4.0.jar!/META-INF/cloudstack/api-config/module.properties]
>> do not appear to be the same resource, please ensure the name property is
>> correct or that the module is not defined twice
>>
>> Where is this value defined?  The full log can be viewed here:
>>
>> pastebin.com/nrdEsxZK
>>
>> I also noticed the error "4.4.0 KVM SystemVm template not found. Cannot
>> upgrade system Vms."  The URL for the system VM template in the 4.4.1
>> upgrade instructions is the same as the one I used when I installed 4.4.0
>> initially.  Is there really a need to install the same template again?
>>
>
>


Re: upgraded to 4.4.1, management console is broken

2014-10-03 Thread Ian Young
It looks like this is the root of the problem:

2014-10-03 13:44:47,131 DEBUG [c.c.u.d.Upgrade440to441] (main:null)
Updating System Vm template IDs
2014-10-03 13:44:47,136 DEBUG [c.c.u.d.Upgrade440to441] (main:null)
Updating LXC System Vms
2014-10-03 13:44:47,137 WARN  [c.c.u.d.Upgrade440to441] (main:null) 4.4.0
LXC SystemVm template not found. LXC hypervisor is not used, so not failing
upgrade
2014-10-03 13:44:47,138 DEBUG [c.c.u.d.Upgrade440to441] (main:null)
Updating KVM System Vms
2014-10-03 13:44:47,141 ERROR [c.c.u.DatabaseUpgradeChecker] (main:null)
Unable to upgrade the database
com.cloud.utils.exception.CloudRuntimeException: 4.4.0 KVM SystemVm
template not found. Cannot upgrade system Vms

Any ideas about how I can fix this?  I had a 4.4.0 KVM SystemVm template
prior to the upgrade.

On Fri, Oct 3, 2014 at 1:39 PM, Ian Young  wrote:

> I tried upgrading from 4.4.0 to 4.4.1 but now I get a 404 error when I try
> to access the management console.  The localhost.2014-10-03.log shows this
> error:
>
> Caused by: java.io.IOException: Resource
> [jar:file:/usr/share/cloudstack-management/webapps/client/WEB-INF/lib/cloud-api-4.4.1.jar!/META-INF/cloudstack/api-config/module.properties]
> and
> [jar:file:/usr/share/cloudstack-management/webapps/client/WEB-INF/lib/cloud-api-4.4.0.jar!/META-INF/cloudstack/api-config/module.properties]
> do not appear to be the same resource, please ensure the name property is
> correct or that the module is not defined twice
>
> Where is this value defined?  The full log can be viewed here:
>
> pastebin.com/nrdEsxZK
>
> I also noticed the error "4.4.0 KVM SystemVm template not found. Cannot
> upgrade system Vms."  The URL for the system VM template in the 4.4.1
> upgrade instructions is the same as the one I used when I installed 4.4.0
> initially.  Is there really a need to install the same template again?
>


upgraded to 4.4.1, management console is broken

2014-10-03 Thread Ian Young
I tried upgrading from 4.4.0 to 4.4.1 but now I get a 404 error when I try
to access the management console.  The localhost.2014-10-03.log shows this
error:

Caused by: java.io.IOException: Resource
[jar:file:/usr/share/cloudstack-management/webapps/client/WEB-INF/lib/cloud-api-4.4.1.jar!/META-INF/cloudstack/api-config/module.properties]
and
[jar:file:/usr/share/cloudstack-management/webapps/client/WEB-INF/lib/cloud-api-4.4.0.jar!/META-INF/cloudstack/api-config/module.properties]
do not appear to be the same resource, please ensure the name property is
correct or that the module is not defined twice

Where is this value defined?  The full log can be viewed here:

pastebin.com/nrdEsxZK

I also noticed the error "4.4.0 KVM SystemVm template not found. Cannot
upgrade system Vms."  The URL for the system VM template in the 4.4.1
upgrade instructions is the same as the one I used when I installed 4.4.0
initially.  Is there really a need to install the same template again?
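
The IOException above names both cloud-api-4.4.0.jar and cloud-api-4.4.1.jar in the same WEB-INF/lib directory, which would be consistent with jars from the previous release remaining after the upgrade. A minimal sketch for listing artifacts that are present in more than one version follows; the file-name pattern is a heuristic assumption (name, hyphen, version starting with a digit), and the function name is hypothetical.

```python
import re
from collections import defaultdict
from pathlib import Path

# Heuristic: cloud-api-4.4.1.jar -> name "cloud-api", version "4.4.1".
JAR_PATTERN = re.compile(r"^(?P<name>.+?)-(?P<version>\d[\w.\-]*)\.jar$")


def find_duplicate_jars(lib_dir):
    """Return {artifact: sorted versions} for artifacts found in more than one version."""
    versions = defaultdict(set)
    for jar in Path(lib_dir).glob("*.jar"):
        match = JAR_PATTERN.match(jar.name)
        if match:
            versions[match.group("name")].add(match.group("version"))
    return {name: sorted(found) for name, found in versions.items() if len(found) > 1}
```

Pointing it at /usr/share/cloudstack-management/webapps/client/WEB-INF/lib (the path from the log) would show whether other jars besides cloud-api are duplicated. The script only reports; it does not decide which copy should be removed.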


system capacity not updating

2014-10-03 Thread Ian Young
I set some new over-provisioning values, stopped all running instances, and
restarted the management service.  When I logged back in, the system
capacity had not changed.  All instances are stopped, yet the dashboard
still reports the same resource usage as before I shut them down.  How do I
refresh this information?


Re: basic zone setup

2014-07-31 Thread Ian Young
Do you know which MySQL tables need to be updated to reference the new
template?  I'm worried that if I miss one the system will break
unexpectedly the next time I launch a system VM.  It might be worthwhile
for me to simply reinstall the entire thing to be certain everything's set
up correctly.


On Thu, Jul 31, 2014 at 12:31 AM, Erik Weber  wrote:

> Yes, if you don't want to reinstall/re-seed the system vm template, you
> should also download the new ones and do the mysql queries so that it is
> used for any future system vm deployments.
>
>
> Erik
>
>
> On Thu, Jul 31, 2014 at 3:17 AM, Ian Young  wrote:
>
> > Yes, that makes the ssvm-check pass all the tests.  Thanks.  Should I
> > repeat that upgrade with the console proxy?
> >
> >
> > On Wed, Jul 30, 2014 at 6:09 PM, Carlos Reategui 
> > wrote:
> >
> > > There have been some messages going around about the template needing a
> > fix
> > > for this.
> > >
> > > Per this link (https://gist.github.com/terbolous/102ae8edd1cda192561c)
> > > from
> > > one of the messages you can try the following on the ssvm itself:
> > >
> > > apt-get update && apt-get -y install openjdk-7-jre-headless
> > > openjdk-7-jre-lib && apt-get -y remove openjdk-6-jre-headless
> > >
> > > then you may also need to:
> > > service cloud stop && sleep 3 && service cloud start
> > >
> > > try the ssvm-check again after that.
> > >
> > > My understanding is that this is not a permanent fix, but should get
> you
> > > going for now.
> > >
> > >
> > > On Wed, Jul 30, 2014 at 6:00 PM, Ian Young 
> > wrote:
> > >
> > > > I found this in the cloud.out log in the SSVM:
> > > >
> > > > Exception in thread "main" java.lang.UnsupportedClassVersionError:
> > > > com/cloud/agent/AgentShell : Unsupported major.minor version 51.0
> > > > at java.lang.ClassLoader.defineClass1(Native Method)
> > > > at java.lang.ClassLoader.defineClass(ClassLoader.java:634)
> > > > at
> > > java.security.SecureClassLoader.defineClass(SecureClassLoader.java:142)
> > > > at java.net.URLClassLoader.defineClass(URLClassLoader.java:277)
> > > > at java.net.URLClassLoader.access$000(URLClassLoader.java:73)
> > > > at java.net.URLClassLoader$1.run(URLClassLoader.java:212)
> > > > at java.security.AccessController.doPrivileged(Native Method)
> > > > at java.net.URLClassLoader.findClass(URLClassLoader.java:205)
> > > > at java.lang.ClassLoader.loadClass(ClassLoader.java:321)
> > > > at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:294)
> > > > at java.lang.ClassLoader.loadClass(ClassLoader.java:266)
> > > > Could not find the main class: com.cloud.agent.AgentShell. Program
> will
> > > > exit.
> > > >
> > > > It seems to have to do with a Java version mismatch.  I'm using JDK 7
> > on
> > > > both the management server and hypervisor but the SSVM is using
> version
> > > 6.
> > > >  Is this the most current system VM template?
> > > >
> > > >
> > > >
> > >
> >
> http://download.cloud.com/templates/4.3/systemvm64template-2014-01-14-master-kvm.qcow2.bz2
> > > >
> > > >
> > > > On Wed, Jul 30, 2014 at 5:25 PM, Ian Young 
> > > wrote:
> > > >
> > > > > root@s-2-VM:~# /usr/local/cloud/systemvm/ssvm-check.sh
> > > > > 
> > > > > First DNS server is  192.168.100.2
> > > > > PING 192.168.100.2 (192.168.100.2): 48 data bytes
> > > > > 56 bytes from 192.168.100.2: icmp_seq=0 ttl=64 time=1.227 ms
> > > > > 56 bytes from 192.168.100.2: icmp_seq=1 ttl=64 time=0.882 ms
> > > > > --- 192.168.100.2 ping statistics ---
> > > > > 2 packets transmitted, 2 packets received, 0% packet loss
> > > > > round-trip min/avg/max/stddev = 0.882/1.054/1.227/0.173 ms
> > > > > Good: Can ping DNS server
> > > > > 
> > > > > Good: DNS resolves download.cloud.com
> > > > > 
> > > > > ERROR: NFS is not currently mounted
> > > > > Try manually mounting from inside the VM
> > > > > NFS server is  169.254.1.0
> > > > > PING 169.254.1.0

Re: basic zone setup

2014-07-30 Thread Ian Young
Yes, that makes the ssvm-check pass all the tests.  Thanks.  Should I
repeat that upgrade with the console proxy?


On Wed, Jul 30, 2014 at 6:09 PM, Carlos Reategui 
wrote:

> There have been some messages going around about the template needing a fix
> for this.
>
> Per this link (https://gist.github.com/terbolous/102ae8edd1cda192561c)
> from
> one of the messages you can try the following on the ssvm itself:
>
> apt-get update && apt-get -y install openjdk-7-jre-headless
> openjdk-7-jre-lib && apt-get -y remove openjdk-6-jre-headless
>
> then you may also need to:
> service cloud stop && sleep 3 && service cloud start
>
> try the ssvm-check again after that.
>
> My understanding is that this is not a permanent fix, but should get you
> going for now.
>
>
> On Wed, Jul 30, 2014 at 6:00 PM, Ian Young  wrote:
>
> > I found this in the cloud.out log in the SSVM:
> >
> > Exception in thread "main" java.lang.UnsupportedClassVersionError:
> > com/cloud/agent/AgentShell : Unsupported major.minor version 51.0
> > at java.lang.ClassLoader.defineClass1(Native Method)
> > at java.lang.ClassLoader.defineClass(ClassLoader.java:634)
> > at
> java.security.SecureClassLoader.defineClass(SecureClassLoader.java:142)
> > at java.net.URLClassLoader.defineClass(URLClassLoader.java:277)
> > at java.net.URLClassLoader.access$000(URLClassLoader.java:73)
> > at java.net.URLClassLoader$1.run(URLClassLoader.java:212)
> > at java.security.AccessController.doPrivileged(Native Method)
> > at java.net.URLClassLoader.findClass(URLClassLoader.java:205)
> > at java.lang.ClassLoader.loadClass(ClassLoader.java:321)
> > at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:294)
> > at java.lang.ClassLoader.loadClass(ClassLoader.java:266)
> > Could not find the main class: com.cloud.agent.AgentShell. Program will
> > exit.
> >
> > It seems to have to do with a Java version mismatch.  I'm using JDK 7 on
> > both the management server and hypervisor but the SSVM is using version
> 6.
> >  Is this the most current system VM template?
> >
> >
> >
> http://download.cloud.com/templates/4.3/systemvm64template-2014-01-14-master-kvm.qcow2.bz2
> >
> >
> > On Wed, Jul 30, 2014 at 5:25 PM, Ian Young 
> wrote:
> >
> > > root@s-2-VM:~# /usr/local/cloud/systemvm/ssvm-check.sh
> > > 
> > > First DNS server is  192.168.100.2
> > > PING 192.168.100.2 (192.168.100.2): 48 data bytes
> > > 56 bytes from 192.168.100.2: icmp_seq=0 ttl=64 time=1.227 ms
> > > 56 bytes from 192.168.100.2: icmp_seq=1 ttl=64 time=0.882 ms
> > > --- 192.168.100.2 ping statistics ---
> > > 2 packets transmitted, 2 packets received, 0% packet loss
> > > round-trip min/avg/max/stddev = 0.882/1.054/1.227/0.173 ms
> > > Good: Can ping DNS server
> > > 
> > > Good: DNS resolves download.cloud.com
> > > 
> > > ERROR: NFS is not currently mounted
> > > Try manually mounting from inside the VM
> > > NFS server is  169.254.1.0
> > > PING 169.254.1.0 (169.254.1.0): 48 data bytes
> > > 56 bytes from 169.254.1.0: icmp_seq=0 ttl=64 time=0.067 ms
> > > 56 bytes from 169.254.1.0: icmp_seq=1 ttl=64 time=0.039 ms
> > > --- 169.254.1.0 ping statistics ---
> > > 2 packets transmitted, 2 packets received, 0% packet loss
> > > round-trip min/avg/max/stddev = 0.039/0.053/0.067/0.000 ms
> > > Good: Can ping nfs server
> > > 
> > > Management server is 192.168.101.3. Checking connectivity.
> > > Good: Can connect to management server port 8250
> > > 
> > > ERROR: Java process not running.  Try restarting the SSVM.
> > >
> > > It says the NFS server is 169.254.1.0 which is the SSVM's link local
> > > address.  How did it decide that?  During the zone configuration I
> > > specified "virthost1.lax.ratespecial.com" as the NFS server and that
> > > resolves to 192.168.101.4.  Also, in what path does it expect the NFS
> > > volume to be mounted?
> > >
> > >
> > > On Wed, Jul 30, 2014 at 4:53 PM, Carlos Reategui 
> > > wrote:
> > >
> > >> If the template is not ready then your ssvm may be having problems
> > >> downloading it.  Have you followed the info here:
> > >>
> > > >>

Re: basic zone setup

2014-07-30 Thread Ian Young
I found this in the cloud.out log in the SSVM:

Exception in thread "main" java.lang.UnsupportedClassVersionError:
com/cloud/agent/AgentShell : Unsupported major.minor version 51.0
at java.lang.ClassLoader.defineClass1(Native Method)
at java.lang.ClassLoader.defineClass(ClassLoader.java:634)
at java.security.SecureClassLoader.defineClass(SecureClassLoader.java:142)
at java.net.URLClassLoader.defineClass(URLClassLoader.java:277)
at java.net.URLClassLoader.access$000(URLClassLoader.java:73)
at java.net.URLClassLoader$1.run(URLClassLoader.java:212)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:205)
at java.lang.ClassLoader.loadClass(ClassLoader.java:321)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:294)
at java.lang.ClassLoader.loadClass(ClassLoader.java:266)
Could not find the main class: com.cloud.agent.AgentShell. Program will
exit.

It seems to have to do with a Java version mismatch.  I'm using JDK 7 on
both the management server and hypervisor but the SSVM is using version 6.
 Is this the most current system VM template?

http://download.cloud.com/templates/4.3/systemvm64template-2014-01-14-master-kvm.qcow2.bz2
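
For reference, the "major.minor version 51.0" in the exception is the class-file format version. For Java 5 and later, the major version is the Java release plus 44, so 51 means the class was compiled for Java 7 and a Java 6 runtime (which accepts at most major version 50) rejects it. A minimal sketch of that mapping, with hypothetical function names:

```python
import struct


def java_release_for_major(major):
    """Map a class-file major version to a Java release (valid for Java 5+)."""
    if major < 49:
        raise ValueError("major versions below 49 predate Java 5")
    return major - 44


def class_file_major_version(data):
    """Read the major version from the first 8 bytes of a .class file."""
    # Layout: 4-byte magic 0xCAFEBABE, 2-byte minor, 2-byte major (big-endian).
    magic, _minor, major = struct.unpack(">IHH", data[:8])
    if magic != 0xCAFEBABE:
        raise ValueError("not a Java class file")
    return major
```

Reading the first eight bytes of any .class file extracted from the agent jar and passing them to class_file_major_version gives the version the code was built for, which can then be compared with the output of java -version inside the system VM.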


On Wed, Jul 30, 2014 at 5:25 PM, Ian Young  wrote:

> root@s-2-VM:~# /usr/local/cloud/systemvm/ssvm-check.sh
> 
> First DNS server is  192.168.100.2
> PING 192.168.100.2 (192.168.100.2): 48 data bytes
> 56 bytes from 192.168.100.2: icmp_seq=0 ttl=64 time=1.227 ms
> 56 bytes from 192.168.100.2: icmp_seq=1 ttl=64 time=0.882 ms
> --- 192.168.100.2 ping statistics ---
> 2 packets transmitted, 2 packets received, 0% packet loss
> round-trip min/avg/max/stddev = 0.882/1.054/1.227/0.173 ms
> Good: Can ping DNS server
> 
> Good: DNS resolves download.cloud.com
> 
> ERROR: NFS is not currently mounted
> Try manually mounting from inside the VM
> NFS server is  169.254.1.0
> PING 169.254.1.0 (169.254.1.0): 48 data bytes
> 56 bytes from 169.254.1.0: icmp_seq=0 ttl=64 time=0.067 ms
> 56 bytes from 169.254.1.0: icmp_seq=1 ttl=64 time=0.039 ms
> --- 169.254.1.0 ping statistics ---
> 2 packets transmitted, 2 packets received, 0% packet loss
> round-trip min/avg/max/stddev = 0.039/0.053/0.067/0.000 ms
> Good: Can ping nfs server
> 
> Management server is 192.168.101.3. Checking connectivity.
> Good: Can connect to management server port 8250
> 
> ERROR: Java process not running.  Try restarting the SSVM.
>
> It says the NFS server is 169.254.1.0 which is the SSVM's link local
> address.  How did it decide that?  During the zone configuration I
> specified "virthost1.lax.ratespecial.com" as the NFS server and that
> resolves to 192.168.101.4.  Also, in what path does it expect the NFS
> volume to be mounted?
>
>
> On Wed, Jul 30, 2014 at 4:53 PM, Carlos Reategui 
> wrote:
>
>> If the template is not ready then your ssvm may be having problems
>> downloading it.  Have you followed the info here:
>>
>> https://cwiki.apache.org/confluence/display/CLOUDSTACK/SSVM,+templates,+Secondary+storage+troubleshooting
>> to make sure that the ssvm is actually working properly?
>>
>>
>> On Wed, Jul 30, 2014 at 4:39 PM, Ian Young 
>> wrote:
>>
>> > I've reinstalled CloudStack 4.4 again, configuring the network as
>> follows:
>> >
>> > management server:
>> > p4p1 http://pastebin.com/skMXxVtk
>> >
>> > hypervisor/storage server:
>> > eth0 http://pastebin.com/LxUxFdpe
>> > eth1 http://pastebin.com/K5si1L4d
>> > cloudbr0 http://pastebin.com/heZGQfVs (management/storage traffic)
>> > cloudbr1 http://pastebin.com/nprB2Kx7 (guest traffic)
>> >
>> > When I logged into the management GUI for the first time, I skipped the
>> > wizard and went straight to the dashboard.  There, I set up a basic
>> zone as
>> > follows:
>> >
>> > http://imgur.com/a/R1dX0#0
>> >
>> > Now that the infrastructure has been launched and the SSVM and console
>> > proxy are running, I noticed that the CentOS template is not ready.
>> >  Neither the management server nor the hypervisor is downloading
>> anything,
>> > so it doesn't appear the CentOS template will be ready.  If I try to
>> > register my own templates, I fill out all the fields but the window just
>> > disappears when I click OK and no template is added.  I don't see any
>> new
>> > messages in the management server l

Re: basic zone setup

2014-07-30 Thread Ian Young
root@s-2-VM:~# /usr/local/cloud/systemvm/ssvm-check.sh

First DNS server is  192.168.100.2
PING 192.168.100.2 (192.168.100.2): 48 data bytes
56 bytes from 192.168.100.2: icmp_seq=0 ttl=64 time=1.227 ms
56 bytes from 192.168.100.2: icmp_seq=1 ttl=64 time=0.882 ms
--- 192.168.100.2 ping statistics ---
2 packets transmitted, 2 packets received, 0% packet loss
round-trip min/avg/max/stddev = 0.882/1.054/1.227/0.173 ms
Good: Can ping DNS server

Good: DNS resolves download.cloud.com

ERROR: NFS is not currently mounted
Try manually mounting from inside the VM
NFS server is  169.254.1.0
PING 169.254.1.0 (169.254.1.0): 48 data bytes
56 bytes from 169.254.1.0: icmp_seq=0 ttl=64 time=0.067 ms
56 bytes from 169.254.1.0: icmp_seq=1 ttl=64 time=0.039 ms
--- 169.254.1.0 ping statistics ---
2 packets transmitted, 2 packets received, 0% packet loss
round-trip min/avg/max/stddev = 0.039/0.053/0.067/0.000 ms
Good: Can ping nfs server

Management server is 192.168.101.3. Checking connectivity.
Good: Can connect to management server port 8250

ERROR: Java process not running.  Try restarting the SSVM.

It says the NFS server is 169.254.1.0 which is the SSVM's link local
address.  How did it decide that?  During the zone configuration I
specified "virthost1.lax.ratespecial.com" as the NFS server and that
resolves to 192.168.101.4.  Also, in what path does it expect the NFS
volume to be mounted?
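
When comparing runs of ssvm-check.sh before and after a change, it can help to reduce the output to its pass/fail lines. This is a minimal sketch that relies only on the "Good:" and "ERROR:" prefixes visible in the output above; the function name is made up for illustration.

```python
def summarize_check(output):
    """Split ssvm-check.sh output into (passed, failed) lists by line prefix."""
    passed, failed = [], []
    for line in output.splitlines():
        text = line.strip()
        if text.startswith("Good:"):
            passed.append(text[len("Good:"):].strip())
        elif text.startswith("ERROR:"):
            failed.append(text[len("ERROR:"):].strip())
    return passed, failed
```

Applied to the run above, the failed list would contain the NFS mount error and the Java process error, while the ping, DNS, and port 8250 checks land in the passed list.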


On Wed, Jul 30, 2014 at 4:53 PM, Carlos Reategui 
wrote:

> If the template is not ready then your ssvm may be having problems
> downloading it.  Have you followed the info here:
>
> https://cwiki.apache.org/confluence/display/CLOUDSTACK/SSVM,+templates,+Secondary+storage+troubleshooting
> to make sure that the ssvm is actually working properly?
>
>
> On Wed, Jul 30, 2014 at 4:39 PM, Ian Young  wrote:
>
> > I've reinstalled CloudStack 4.4 again, configuring the network as
> follows:
> >
> > management server:
> > p4p1 http://pastebin.com/skMXxVtk
> >
> > hypervisor/storage server:
> > eth0 http://pastebin.com/LxUxFdpe
> > eth1 http://pastebin.com/K5si1L4d
> > cloudbr0 http://pastebin.com/heZGQfVs (management/storage traffic)
> > cloudbr1 http://pastebin.com/nprB2Kx7 (guest traffic)
> >
> > When I logged into the management GUI for the first time, I skipped the
> > wizard and went straight to the dashboard.  There, I set up a basic zone
> as
> > follows:
> >
> > http://imgur.com/a/R1dX0#0
> >
> > Now that the infrastructure has been launched and the SSVM and console
> > proxy are running, I noticed that the CentOS template is not ready.
> >  Neither the management server nor the hypervisor is downloading
> anything,
> > so it doesn't appear the CentOS template will be ready.  If I try to
> > register my own templates, I fill out all the fields but the window just
> > disappears when I click OK and no template is added.  I don't see any new
> > messages in the management server log at the time this occurs.  I suspect
> > there is a storage problem.  However, I can mount the NFS shares onto the
> > management server with no problems.  That's how I was able to manually
> > download the system VM template, as the installation guide indicated.
> >  What's wrong with this setup?  I don't see any obvious errors in the
> > management log besides these repetitive messages, which seem to
> contradict
> > the fact that there is an SSVM and console proxy running:
> >
> > http://pastebin.com/yvW5GmSB
> >
>


basic zone setup

2014-07-30 Thread Ian Young
I've reinstalled CloudStack 4.4 again, configuring the network as follows:

management server:
p4p1 http://pastebin.com/skMXxVtk

hypervisor/storage server:
eth0 http://pastebin.com/LxUxFdpe
eth1 http://pastebin.com/K5si1L4d
cloudbr0 http://pastebin.com/heZGQfVs (management/storage traffic)
cloudbr1 http://pastebin.com/nprB2Kx7 (guest traffic)

When I logged into the management GUI for the first time, I skipped the
wizard and went straight to the dashboard.  There, I set up a basic zone as
follows:

http://imgur.com/a/R1dX0#0

Now that the infrastructure has been launched and the SSVM and console
proxy are running, I noticed that the CentOS template is not ready.
Neither the management server nor the hypervisor is downloading anything,
so it doesn't appear the CentOS template will be ready.  If I try to
register my own templates, I fill out all the fields but the window just
disappears when I click OK and no template is added.  I don't see any new
messages in the management server log at the time this occurs.  I suspect
there is a storage problem.  However, I can mount the NFS shares onto the
management server with no problems.  That's how I was able to manually
download the system VM template, as the installation guide indicated.
 What's wrong with this setup?  I don't see any obvious errors in the
management log besides these repetitive messages, which seem to contradict
the fact that there is an SSVM and console proxy running:

http://pastebin.com/yvW5GmSB


Re: dual NIC VLAN configuration

2014-07-28 Thread Ian Young
Is private traffic the same thing as management/storage traffic?


On Fri, Jul 25, 2014 at 11:17 PM, Geoff Higginbottom <
geoff.higginbot...@shapeblue.com> wrote:

> Hi Ian,
>
> As you are deploying a Basic network there will be no public traffic.
>
> The private traffic, assuming you allocate an IP range to the POD which is
> in the same CIDR as the Management Server, would typically be assigned to
> cloudbr0:
>
> private.network.device=cloudbr0
>
> Guest traffic would then be assigned to cloudbr1:
>
> guest.network.device=cloudbr1
>
>
>
> Regards
>
> Geoff Higginbottom
> CTO / Cloud Architect
>
> D: +44 20 3603 0542 | S: +44 20 3603 0540 | M: +447968161581
>
> geoff.higginbot...@shapeblue.com
> | www.shapeblue.com | Twitter: @cloudstackguru
> <https://twitter.com/#!/cloudstackguru>
>
> ShapeBlue Ltd, 53 Chandos Place, Covent Garden, London, WC2N
> 4HS
>
>
> On 25 Jul 2014, at 19:18, "Ian Young" <iyo...@ratespecial.com> wrote:
>
> So if management/storage traffic is on cloudbr0 and guest VMs are on
> cloudbr1, would these be the correct settings in agent.properties?
>
> guest.network.device=cloudbr1
> private.network.device=cloudbr1
> public.network.device=cloudbr1
>
>
> On Fri, Jul 25, 2014 at 10:11 AM, Ian Young <iyo...@ratespecial.com> wrote:
>
> Thank you, Geoff.  That was precisely the answer I was looking for.  I
> knew I was doing something wrong.  I didn't realize the second adapter
> could be used without an IP address explicitly assigned to it.  Yes, this
> is a basic zone (just an internal project so we don't need any public IP
> addresses).  I was planning to set up an NFS server on the
> 192.168.101.0/24 network so this is exactly what I was trying to
> accomplish.  Thanks.
>
>
> On Fri, Jul 25, 2014 at 1:34 AM, Geoff Higginbottom <
> geoff.higginbot...@shapeblue.com> wrote:
>
> Ian,
>
> It looks like you are trying to set up a basic zone and have a Management
> Server on IP 192.168.101.3 and a Host on IP 192.168.101.4.
>
> The second interface on the Host does not need any IP configuration on
> the Host, as it will not be used by the Host, so remove the 192.168.102.4
> mapping.  This interface will be used by the Guest VMs running on the Host,
> which will have their own IP scheme.
>
> Your Guest IP range will be in the 192.168.102.0/24 CIDR with a gateway
> of 192.168.102.1.
>
> The Management Server will talk to the Host via the 1st interface, and
> Guest VMs will use the 2nd.
>
> You have not mentioned storage, but assuming you are using NFS for
> Primary and Secondary, put the NFS Server on the 192.168.101.0/24
> network, and then all storage traffic will also go over the 1st interface.
>
> Regards
>
> Geoff Higginbottom
>
> D: +44 20 3603 0542 | S: +44 20 3603 0540 | M: +447968161581
>
> geoff.higginbot...@shapeblue.com
>
> -Original Message-
> From: Daan Hoogland [mailto:daan.hoogl...@gmail.com]
> Sent: 25 July 2014 08:47
> To: users@cloudstack.apache.org
> Subject: Re: dual NIC VLAN configuration
>
> Ian, I would imagine that guest traffic can't go out to the net this way.
> Maybe you should swap them. This is only guessing however. What are you
> seeing?
>
> On Fri, Jul 25, 2014 at 2:00 AM, Ian Young <iyo...@ratespecial.com>
> wrote:
> Here's the less verbose version:  My hypervisor has two NICs and I've
> set up a label on each.  Traffic to and from cloudbr0 works perfectly.
> Traffic going into cloudbr1 goes out cloudbr0 because that interface
> has a default gateway.  Will this pose a problem when I try to set up
> separate management and guest networks in CloudStack?
>
>
> On Thu, Jul 24, 2014 at 10:56 AM, Ian Young <iyo...@ratespecial.com>
> wrote:
>
> I am trying to set up a server with two NICs as a hypervisor.  I
> would like to use the two interfaces to separate management and guest
> traffic, as recommended by the CloudStack installation guide.  This
> server is connected to a managed switch, which is connected to a
> hardware firewall, both of which are set up with tagged VLANs.  Some
> of the ports on the switch are designated as VLAN 6 and some are VLAN
> 7.  I've confirmed the VLANs are set up correctly by configuring eth0
> and eth1 (one at a time) with the appropriate IP address, netmask, and
> gateway.
>
> However, the difficulty arises when I try to configure both
> interfaces simultaneously.  The return traffic ten

Re: dual NIC VLAN configuration

2014-07-25 Thread Ian Young
So if management/storage traffic is on cloudbr0 and guest VMs are on
cloudbr1, would these be the correct settings in agent.properties?

guest.network.device=cloudbr1
private.network.device=cloudbr1
public.network.device=cloudbr1
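
The reply quoted elsewhere in this thread suggests private.network.device=cloudbr0 and guest.network.device=cloudbr1 for this layout. A minimal sketch for reading the traffic-label keys out of agent.properties and comparing them with an expected mapping follows. It assumes the file uses plain key=value lines with # comments; the function names and the expected mapping are illustrative, not part of CloudStack.

```python
def read_traffic_devices(text):
    """Return the *.network.device entries from key=value properties text."""
    devices = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        if key.strip().endswith(".network.device"):
            devices[key.strip()] = value.strip()
    return devices


def mismatches(devices, expected):
    """Return {key: (actual, expected)} for keys whose value differs or is missing."""
    return {
        key: (devices.get(key), want)
        for key, want in expected.items()
        if devices.get(key) != want
    }
```

With the three lines above as input and an expected mapping of private to cloudbr0 and guest to cloudbr1, mismatches reports only private.network.device, since it is set to cloudbr1 here.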


On Fri, Jul 25, 2014 at 10:11 AM, Ian Young  wrote:

> Thank you, Geoff.  That was precisely the answer I was looking for.  I
> knew I was doing something wrong.  I didn't realize the second adapter
> could be used without an IP address explicitly assigned to it.  Yes, this
> is a basic zone (just an internal project so we don't need any public IP
> addresses).  I was planning to set up an NFS server on the
> 192.168.101.0/24 network so this is exactly what I was trying to
> accomplish.  Thanks.
>
>
> On Fri, Jul 25, 2014 at 1:34 AM, Geoff Higginbottom <
> geoff.higginbot...@shapeblue.com> wrote:
>
>> Ian,
>>
>> It looks like you are trying to set up a basic zone and have a Management
>> Server on IP 192.168.101.3 and a Host on IP 192.168.101.4.
>>
>> The second interface on the Host does not need any IP configuration on
>> the Host, as it will not be used by the Host, so remove the 192.168.102.4
>> mapping.  This interface will be used by the Guest VMs running on the Host,
>> which will have their own IP scheme.
>>
>> Your Guest IP range will be in the 192.168.102.0/24 CIDR with a gateway
>> of 192.168.102.1.
>>
>> The Management Server will talk to the Host via the 1st interface, and
>> Guest VMs will use the 2nd.
>>
>> You have not mentioned storage, but assuming you are using NFS for
>> Primary and Secondary, put the NFS Server on the 192.168.101.0/24
>> network, and then all storage traffic will also go over the 1st interface.
>>
>> Regards
>>
>> Geoff Higginbottom
>>
>> D: +44 20 3603 0542 | S: +44 20 3603 0540 | M: +447968161581
>>
>> geoff.higginbot...@shapeblue.com
>>
>> -Original Message-
>> From: Daan Hoogland [mailto:daan.hoogl...@gmail.com]
>> Sent: 25 July 2014 08:47
>> To: users@cloudstack.apache.org
>> Subject: Re: dual NIC VLAN configuration
>>
>> Ian, I would imagine that guest traffic can't go out to the net this way.
>> Maybe you should swap them. This is only guessing however. What are you
>> seeing?
>>
>> On Fri, Jul 25, 2014 at 2:00 AM, Ian Young 
>> wrote:
>> > Here's the less verbose version:  My hypervisor has two NICs and I've
>> > set up a label on each.  Traffic to and from cloudbr0 works perfectly.
>> > Traffic going into cloudbr1 goes out cloudbr0 because that interface
>> > has a default gateway.  Will this pose a problem when I try to set up
>> > separate management and guest networks in CloudStack?
>> >
>> >
>> > On Thu, Jul 24, 2014 at 10:56 AM, Ian Young 
>> wrote:
>> >
>> >> I am trying to set up a server with two NICs as a hypervisor.  I
>> >> would like to use the two interfaces to separate management and guest
>> >> traffic, as recommended by the CloudStack installation guide.  This
>> >> server is connected to a managed switch, which is connected to a
>> >> hardware firewall, both of which are set up with tagged VLANs.  Some
>> >> of the ports on the switch are designated as VLAN 6 and some are VLAN
>> >> 7.  I've confirmed the VLANs are set up correctly by configuring eth0
>> >> and eth1 (one at a time) with the appropriate IP address, netmask, and
>> gateway.
>> >>
>> >> However, the difficulty arises when I try to configure both
>> >> interfaces simultaneously.  The return traffic tends to go out
>> >> whichever interface is associated with the default gateway, a typical
>> >> issue when using multiple network interfaces.  I've followed numerous
>> >> guides, which all basically say the same thing:  Don't set a default
>> >> gateway; use iproute2 to control the flow of traffic with route-eth0,
>> >> rule-eth0, and rt_tables.  I've tried setting this up numerous times
>> >> to no avail, probably because the guides I'm reading don't involve
>> >> VLANs.  Add to that the cloudbr0 and cloudbr1 bridges that
>> >> CloudStack requires and now I'm really confused as to how to set up
>> >> the network.  I can't be the first person to have set up CloudStack
>> >> this way; it sounds pretty common.  Can someone explain to me the
>> correct way to configure these interfaces?
>> >>
>> >> Here is my network information:
>> >>

Re: dual NIC VLAN configuration

2014-07-25 Thread Ian Young
Thank you, Geoff.  That was precisely the answer I was looking for.  I knew
I was doing something wrong.  I didn't realize the second adapter could be
used without an IP address explicitly assigned to it.  Yes, this is a basic
zone (just an internal project so we don't need any public IP addresses).
 I was planning to set up an NFS server on the 192.168.101.0/24 network so
this is exactly what I was trying to accomplish.  Thanks.
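
The addressing plan in the reply quoted below (management, host, and NFS on 192.168.101.0/24; guests on 192.168.102.0/24 with gateway 192.168.102.1) can be sanity-checked with the standard ipaddress module. This is a minimal sketch; the labels and function name are illustrative only.

```python
import ipaddress

# Networks taken from the plan described in this thread.
MANAGEMENT_NET = ipaddress.ip_network("192.168.101.0/24")
GUEST_NET = ipaddress.ip_network("192.168.102.0/24")


def network_for(address):
    """Classify an IP address as management/storage, guest, or unknown."""
    ip = ipaddress.ip_address(address)
    if ip in MANAGEMENT_NET:
        return "management/storage"
    if ip in GUEST_NET:
        return "guest"
    return "unknown"
```

For instance, network_for("192.168.101.4") classifies the host as management/storage and network_for("192.168.102.1") classifies the guest gateway as guest; any address planned for the NFS server should fall in the first group.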


On Fri, Jul 25, 2014 at 1:34 AM, Geoff Higginbottom <
geoff.higginbot...@shapeblue.com> wrote:

> Ian,
>
> It looks like you are trying to set up a basic zone and have a Management
> Server on IP 192.168.101.3 and a Host on IP 192.168.101.4.
>
> The second interface on the host does not need any IP configuration on the
> Host, as it will not be used by the Host, so remove the 192.168.102.4
> mapping.  This interface will be used by the Guest VMs running on the Host,
> which will have their own IP schema.
>
> Your Guest IP range will be in the 192.168.102.0/24 CIDR with a gateway
> of 192.168.102.1
>
> The Management Server will talk to the Host via the 1st Interface, and
> Guest VMs will use the 2nd.
>
> You have not mentioned storage, but assuming you are using NFS for Primary
> and Secondary, put the NFS Server on the 192.168.101.0/24 network, and
> then all storage traffic will also go over the 1st interface.
>
> Regards
>
> Geoff Higginbottom
>
> D: +44 20 3603 0542 | S: +44 20 3603 0540 | M: +447968161581
>
> geoff.higginbot...@shapeblue.com
>
> -----Original Message-----
> From: Daan Hoogland [mailto:daan.hoogl...@gmail.com]
> Sent: 25 July 2014 08:47
> To: users@cloudstack.apache.org
> Subject: Re: dual NIC VLAN configuration
>
> Ian, I would imagine that guest traffic can't go out to the net this way.
> Maybe you should swap them. This is only guessing however. What are you
> seeing?
>
> On Fri, Jul 25, 2014 at 2:00 AM, Ian Young  wrote:
> > Here's the less verbose version:  My hypervisor has two NICs and I've
> > set up a label on each.  Traffic to and from cloudbr0 works perfectly.
> > Traffic going into cloudbr1 goes out cloudbr0 because that interface
> > has a default gateway.  Will this pose a problem when I try to set up
> > separate management and guest networks in CloudStack?
> >
> >
> > On Thu, Jul 24, 2014 at 10:56 AM, Ian Young 
> wrote:
> >
> >> I am trying to set up a server with two NICs as a hypervisor.  I
> >> would like to use the two interfaces to separate management and guest
> >> traffic, as recommended by the CloudStack installation guide.  This
> >> server is connected to a managed switch, which is connected to a
> >> hardware firewall, both of which are set up with tagged VLANs.  Some
> >> of the ports on the switch are designated as VLAN 6 and some are VLAN
> >> 7.  I've confirmed the VLANs are set up correctly by configuring eth0
> >> and eth1 (one at a time) with the appropriate IP address, netmask, and
> gateway.
> >>
> >> However, the difficulty arises when I try to configure both
> >> interfaces simultaneously.  The return traffic tends to go out
> >> whichever interface is associated with the default gateway, a typical
> >> issue when using multiple network interfaces.  I've followed numerous
> >> guides, which all basically say the same thing:  Don't set a default
> >> gateway; use iproute2 to control the flow of traffic with route-eth0,
> >> rule-eth0, and rt_tables.  I've tried setting this up numerous times
> >> to no avail, probably because the guides I'm reading don't involve
> >> VLANs.  Add to that the cloudbr0 and cloudbr1 bridges that
> >> CloudStack requires and now I'm really confused as to how to set up
> >> the network.  I can't be the first person to have set up CloudStack
> >> this way; it sounds pretty common.  Can someone explain to me the
> correct way to configure these interfaces?
> >>
> >> Here is my network information:
> >>
> >> VLAN 6 (management)
> >> 192.168.101.0/24
> >> gateway: 192.168.101.1
> >>
> >> VLAN 7 (guest)
> >> 192.168.102.0/24
> >> gateway: 192.168.102.1
> >>
> >> current hypervisor settings:
> >> eth0: 192.168.101.4
> >> eth1: 192.168.102.4
> >>
> >> current management server settings (this is a separate machine):
> >> p4p1: 192.168.101.3
> >>
>
>
>
> --
> Daan

Re: dual NIC VLAN configuration

2014-07-24 Thread Ian Young
Here's the less verbose version:  My hypervisor has two NICs and I've set
up a label on each.  Traffic to and from cloudbr0 works perfectly.  Traffic
going into cloudbr1 goes out cloudbr0 because that interface has a default
gateway.  Will this pose a problem when I try to set up separate management
and guest networks in CloudStack?


On Thu, Jul 24, 2014 at 10:56 AM, Ian Young  wrote:

> I am trying to set up a server with two NICs as a hypervisor.  I would
> like to use the two interfaces to separate management and guest traffic, as
> recommended by the CloudStack installation guide.  This server is connected
> to a managed switch, which is connected to a hardware firewall, both of
> which are set up with tagged VLANs.  Some of the ports on the switch are
> designated as VLAN 6 and some are VLAN 7.  I've confirmed the VLANs are set
> up correctly by configuring eth0 and eth1 (one at a time) with the
> appropriate IP address, netmask, and gateway.
>
> However, the difficulty arises when I try to configure both interfaces
> simultaneously.  The return traffic tends to go out whichever interface is
> associated with the default gateway, a typical issue when using multiple
> network interfaces.  I've followed numerous guides, which all basically say
> the same thing:  Don't set a default gateway; use iproute2 to control the
> flow of traffic with route-eth0, rule-eth0, and rt_tables.  I've tried
> setting this up numerous times to no avail, probably because the guides I'm
> reading don't involve VLANs.  Add to that the cloudbr0 and cloudbr1
> bridges that CloudStack requires and now I'm really confused as to how to
> set up the network.  I can't be the first person to have set up CloudStack
> this way; it sounds pretty common.  Can someone explain to me the correct
> way to configure these interfaces?
>
> Here is my network information:
>
> VLAN 6 (management)
> 192.168.101.0/24
> gateway: 192.168.101.1
>
> VLAN 7 (guest)
> 192.168.102.0/24
> gateway: 192.168.102.1
>
> current hypervisor settings:
> eth0: 192.168.101.4
> eth1: 192.168.102.4
>
> current management server settings (this is a separate machine):
> p4p1: 192.168.101.3
>


dual NIC VLAN configuration

2014-07-24 Thread Ian Young
I am trying to set up a server with two NICs as a hypervisor.  I would like
to use the two interfaces to separate management and guest traffic, as
recommended by the CloudStack installation guide.  This server is connected
to a managed switch, which is connected to a hardware firewall, both of
which are set up with tagged VLANs.  Some of the ports on the switch are
designated as VLAN 6 and some are VLAN 7.  I've confirmed the VLANs are set
up correctly by configuring eth0 and eth1 (one at a time) with the
appropriate IP address, netmask, and gateway.

However, the difficulty arises when I try to configure both interfaces
simultaneously.  The return traffic tends to go out whichever interface is
associated with the default gateway, a typical issue when using multiple
network interfaces.  I've followed numerous guides, which all basically say
the same thing:  Don't set a default gateway; use iproute2 to control the
flow of traffic with route-eth0, rule-eth0, and rt_tables.  I've tried
setting this up numerous times to no avail, probably because the guides I'm
reading don't involve VLANs.  Add to that the cloudbr0 and cloudbr1
bridges that CloudStack requires and now I'm really confused as to how to
set up the network.  I can't be the first person to have set up CloudStack
this way; it sounds pretty common.  Can someone explain to me the correct
way to configure these interfaces?

Here is my network information:

VLAN 6 (management)
192.168.101.0/24
gateway: 192.168.101.1

VLAN 7 (guest)
192.168.102.0/24
gateway: 192.168.102.1

current hypervisor settings:
eth0: 192.168.101.4
eth1: 192.168.102.4

current management server settings (this is a separate machine):
p4p1: 192.168.101.3
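
To make the policy-routing approach from those guides concrete, here is a rough sketch that renders the three files for the addresses above. It assumes the CentOS 6 network-scripts convention in which each line of route-<iface> is handed to "ip route add" and each line of rule-<iface> to "ip rule add"; the table numbers (101, 102), the table names (mgmt, guest), and the helper name render_policy_routing are hypothetical. If the addresses live on the cloudbr0/cloudbr1 bridges rather than on eth0/eth1, the files would presumably need the bridge names instead.

```python
import ipaddress


def render_policy_routing(iface, cidr_address, gateway, table_id, table_name):
    """Return the file fragments for source-based routing on one interface."""
    interface = ipaddress.ip_interface(cidr_address)
    address, network = interface.ip, interface.network
    return {
        # Line to append to /etc/iproute2/rt_tables: names the extra table.
        "rt_tables": f"{table_id} {table_name}\n",
        # Routes that exist only in this interface's own table.
        f"route-{iface}": (
            f"{network} dev {iface} src {address} table {table_name}\n"
            f"default via {gateway} dev {iface} table {table_name}\n"
        ),
        # Replies sourced from this address consult that table, so they
        # leave through the interface they arrived on.
        f"rule-{iface}": f"from {address} table {table_name}\n",
    }


mgmt = render_policy_routing("eth0", "192.168.101.4/24", "192.168.101.1", 101, "mgmt")
guest = render_policy_routing("eth1", "192.168.102.4/24", "192.168.102.1", 102, "guest")

for files in (mgmt, guest):
    for name, text in files.items():
        print(f"# {name}\n{text}")
```

This only illustrates the technique the guides describe; it is not something I have working here.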


local storage for system VMs

2014-05-23 Thread Ian Young
My CloudStack 4.3 system is a single server (for the time being, at least).
Since a system VM malfunction is a show-stopper, I would like to host
those on local storage to avoid issues with NFS mounts.  I have changed the
value system.vm.use.local.storage to true.  I don't see an option for
use.local.storage, so maybe that's been removed.  The system offerings for
SSVM, console proxy, and software router are now set to Storage Type =
local.

Do I need to create new compute offerings, or is that for regular
instances?  I want to keep normal instances on shared storage.

How do I make sure the system VMs are running on local storage?  I've
restarted them but the qemu process still says

-drive
file=/mnt/2a7ec307-d797-3287-aa31-7e280afb56cf/d8668fbc-dd3b-4c85-952e-40947eda7b99,if=none,id=drive-virtio-disk0,format=qcow2,cache=none"

which is a shared volume.  Do I need to destroy them and create new ones?
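
For anyone wanting to repeat the check, this is roughly how I am deciding whether a disk is on shared storage: pull the file= path out of the qemu -drive option and see which mount it falls under. The sketch below is only an illustration; the mount table is a made-up sample in /proc/mounts format (the NFS server name nfs1 and the device names are hypothetical), and the helper names are my own.

```python
import re


def drive_file(qemu_args):
    """Extract the file= path from the first -drive option in a qemu command line."""
    match = re.search(r"-drive\s+file=([^,\s]+)", qemu_args)
    return match.group(1) if match else None


def filesystem_type(path, mounts_text):
    """Return the fs type of the longest mount point that contains path."""
    best_mount, best_type = "", None
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 3:
            continue
        mount_point, fs_type = fields[1], fields[2]
        prefix = mount_point.rstrip("/") + "/"
        if (path + "/").startswith(prefix) and len(mount_point) > len(best_mount):
            best_mount, best_type = mount_point, fs_type
    return best_type


# Sample data: the -drive option is the one quoted above; the mount table is
# a hypothetical stand-in for the contents of /proc/mounts on the host.
qemu_args = (
    "-drive file=/mnt/2a7ec307-d797-3287-aa31-7e280afb56cf/"
    "d8668fbc-dd3b-4c85-952e-40947eda7b99,if=none,id=drive-virtio-disk0,"
    "format=qcow2,cache=none"
)
sample_mounts = """\
/dev/mapper/vg-root / ext4 rw 0 0
nfs1:/export/primary /mnt/2a7ec307-d797-3287-aa31-7e280afb56cf nfs rw 0 0
"""

disk = drive_file(qemu_args)
print(disk, "->", filesystem_type(disk, sample_mounts))
```

On the real host the second argument would be the text of /proc/mounts; a result of nfs for a system VM disk means it is still on shared storage.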


Re: console proxy times out

2014-05-23 Thread Ian Young
I'm still getting a lot of these, though.

2014-05-23 10:52:27,908 INFO  [c.c.h.HighAvailabilityManagerImpl]
(HA-Worker-3:ctx-838a6dc4 work-873) HA on VM[ConsoleProxy|v-2-VM]
2014-05-23 10:52:27,908 INFO  [c.c.h.HighAvailabilityManagerImpl]
(HA-Worker-3:ctx-838a6dc4 work-873) VM VM[ConsoleProxy|v-2-VM] has been
changed.  Current State = Running Previous State = Starting last updated =
571 previous updated = 568
2014-05-23 10:52:27,908 INFO  [c.c.h.HighAvailabilityManagerImpl]
(HA-Worker-3:ctx-838a6dc4 work-873) Completed
HAWork[873-HA-2-Starting-Investigating]


On Fri, May 23, 2014 at 10:50 AM, Ian Young  wrote:

> I destroyed the SSVM and then tried hacking the database to make
> CloudStack realize that the console proxy is in fact stopped.
>
> mysql> update vm_instance set state='Stopped' where name='v-2-VM';
> mysql> update host set status='Up' where name='v-2-VM';
>
> Now they're both running and I can see the console.  There's got to be a
> better way to use this system without having to reboot or hack the database
> daily.
>
>
> On Fri, May 23, 2014 at 10:42 AM, Ian Young wrote:
>
>> Also, is this normal?  Every time the server is rebooted, it adds another
>> record to the mshost table but the "removed" field is always NULL.
>>
>> http://pastebin.com/q5zDCu4b
>>
>>
>> On Fri, May 23, 2014 at 10:39 AM, Ian Young wrote:
>>
>>> The SSVM is stopped.  If I try to start it, it complains about
>>> insufficient capacity.  CPU?  RAM?  I have plenty of both available.
>>>
>>> 2014-05-23 10:36:51,196 DEBUG [c.c.d.FirstFitPlanner]
>>> (Job-Executor-2:ctx-ababac38 ctx-0906d3c3) Listing clusters in order of
>>> aggregate capacity, that have (atleast one host with) enough CPU and RAM
>>> capacity under this Pod: 1
>>> 2014-05-23 10:36:51,198 DEBUG [c.c.d.FirstFitPlanner]
>>> (Job-Executor-2:ctx-ababac38 ctx-0906d3c3) Removing from the clusterId list
>>> these clusters from avoid set: [1]
>>> 2014-05-23 10:36:51,198 DEBUG [c.c.d.FirstFitPlanner]
>>> (Job-Executor-2:ctx-ababac38 ctx-0906d3c3) No clusters found after removing
>>> disabled clusters and clusters in avoid list, returning.
>>> 2014-05-23 10:36:51,201 DEBUG [c.c.c.CapacityManagerImpl]
>>> (Job-Executor-2:ctx-ababac38 ctx-0906d3c3) VM state transitted from
>>> :Starting to Stopped with event: OperationFailedvm's original host id: 1
>>> new host id: null host id before state transition: null
>>> 2014-05-23 10:36:51,201 WARN  [c.c.s.s.SecondaryStorageManagerImpl]
>>> (Job-Executor-2:ctx-ababac38 ctx-0906d3c3) Exception while trying to start
>>> secondary storage vm
>>> com.cloud.exception.InsufficientServerCapacityException: Unable to
>>> create a deployment for VM[SecondaryStorageVm|s-1-VM]Scope=interface
>>> com.cloud.dc.DataCenter; id=1
>>>
>>>
>>> On Fri, May 23, 2014 at 10:35 AM, Ian Young wrote:
>>>
>>>> I rebooted it and now it's in an even more broken state.  It's
>>>> repeatedly trying to stop the console proxy but can't because its state is
>>>> "Starting."  Here is an excerpt from the management log:
>>>>
>>>> http://pastebin.com/FiaDzKXb
>>>>
>>>> The agent log keeps repeating these messages:
>>>>
>>>> http://pastebin.com/yDidSbrz
>>>>
>>>> What's wrong with it?
>>>>
>>>>
>>>> On Thu, May 22, 2014 at 12:55 PM, Ian Young wrote:
>>>>
>>>>> I wonder if something is wrong with the NFS mount.  I see this error
>>>>> periodically in /var/log/messages even though I have set the Domain in
>>>>> /etc/idmapd.conf to the host's FQDN:
>>>>>
>>>>> May 20 19:30:22 virthost1 rpc.idmapd[1790]: nss_getpwnam: name '0'
>>>>> does not map into domain 'redacted.com'
>>>>> May 20 19:36:20 virthost1 rpc.idmapd[1731]: nss_getpwnam: name '0'
>>>>> does not map into domain 'redacted.com'
>>>>> May 20 19:44:35 virthost1 rpc.idmapd[1731]: nss_getpwnam: name '0'
>>>>> does not map into domain 'redacted.com'
>>>>> May 21 10:21:25 virthost1 rpc.idmapd[1731]: nss_getpwnam: name '0'
>>>>> does not map into domain 'redacted.com'
>>>>> May 21 12:46:40 virthost1 rpc.idmapd[1731]: nss_getpwnam: name '0'
>>>>> does not map into domain 'redacted.com'
>>>>>

Re: console proxy times out

2014-05-23 Thread Ian Young
I destroyed the SSVM and then tried hacking the database to make CloudStack
realize that the console proxy is in fact stopped.

mysql> update vm_instance set state='Stopped' where name='v-2-VM';
mysql> update host set status='Up' where name='v-2-VM';

Now they're both running and I can see the console.  There's got to be a
better way to use this system without having to reboot or hack the database
daily.


On Fri, May 23, 2014 at 10:42 AM, Ian Young  wrote:

> Also, is this normal?  Every time the server is rebooted, it adds another
> record to the mshost table but the "removed" field is always NULL.
>
> http://pastebin.com/q5zDCu4b
>
>
> On Fri, May 23, 2014 at 10:39 AM, Ian Young wrote:
>
>> The SSVM is stopped.  If I try to start it, it complains about
>> insufficient capacity.  CPU?  RAM?  I have plenty of both available.
>>
>> 2014-05-23 10:36:51,196 DEBUG [c.c.d.FirstFitPlanner]
>> (Job-Executor-2:ctx-ababac38 ctx-0906d3c3) Listing clusters in order of
>> aggregate capacity, that have (atleast one host with) enough CPU and RAM
>> capacity under this Pod: 1
>> 2014-05-23 10:36:51,198 DEBUG [c.c.d.FirstFitPlanner]
>> (Job-Executor-2:ctx-ababac38 ctx-0906d3c3) Removing from the clusterId list
>> these clusters from avoid set: [1]
>> 2014-05-23 10:36:51,198 DEBUG [c.c.d.FirstFitPlanner]
>> (Job-Executor-2:ctx-ababac38 ctx-0906d3c3) No clusters found after removing
>> disabled clusters and clusters in avoid list, returning.
>> 2014-05-23 10:36:51,201 DEBUG [c.c.c.CapacityManagerImpl]
>> (Job-Executor-2:ctx-ababac38 ctx-0906d3c3) VM state transitted from
>> :Starting to Stopped with event: OperationFailedvm's original host id: 1
>> new host id: null host id before state transition: null
>> 2014-05-23 10:36:51,201 WARN  [c.c.s.s.SecondaryStorageManagerImpl]
>> (Job-Executor-2:ctx-ababac38 ctx-0906d3c3) Exception while trying to start
>> secondary storage vm
>> com.cloud.exception.InsufficientServerCapacityException: Unable to create
>> a deployment for VM[SecondaryStorageVm|s-1-VM]Scope=interface
>> com.cloud.dc.DataCenter; id=1
>>
>>
>> On Fri, May 23, 2014 at 10:35 AM, Ian Young wrote:
>>
>>> I rebooted it and now it's in an even more broken state.  It's
>>> repeatedly trying to stop the console proxy but can't because its state is
>>> "Starting."  Here is an excerpt from the management log:
>>>
>>> http://pastebin.com/FiaDzKXb
>>>
>>> The agent log keeps repeating these messages:
>>>
>>> http://pastebin.com/yDidSbrz
>>>
>>> What's wrong with it?
>>>
>>>
>>> On Thu, May 22, 2014 at 12:55 PM, Ian Young wrote:
>>>
>>>> I wonder if something is wrong with the NFS mount.  I see this error
>>>> periodically in /var/log/messages even though I have set the Domain in
>>>> /etc/idmapd.conf to the host's FQDN:
>>>>
>>>> May 20 19:30:22 virthost1 rpc.idmapd[1790]: nss_getpwnam: name '0' does
>>>> not map into domain 'redacted.com'
>>>> May 20 19:36:20 virthost1 rpc.idmapd[1731]: nss_getpwnam: name '0' does
>>>> not map into domain 'redacted.com'
>>>> May 20 19:44:35 virthost1 rpc.idmapd[1731]: nss_getpwnam: name '0' does
>>>> not map into domain 'redacted.com'
>>>> May 21 10:21:25 virthost1 rpc.idmapd[1731]: nss_getpwnam: name '0' does
>>>> not map into domain 'redacted.com'
>>>> May 21 12:46:40 virthost1 rpc.idmapd[1731]: nss_getpwnam: name '0' does
>>>> not map into domain 'redacted.com'
>>>> May 21 13:52:42 virthost1 rpc.idmapd[1731]: nss_getpwnam: name '0' does
>>>> not map into domain 'redacted.com'
>>>> May 21 13:55:20 virthost1 rpc.idmapd[1731]: nss_getpwnam: name '107'
>>>> does not map into domain 'redacted.com'
>>>> May 21 20:31:51 virthost1 rpc.idmapd[1731]: nss_getpwnam: name '107'
>>>> does not map into domain 'redacted.com'
>>>> May 22 10:14:18 virthost1 rpc.idmapd[1731]: nss_getpwnam: name '0' does
>>>> not map into domain 'redacted.com'
>>>> May 22 10:18:40 virthost1 rpc.idmapd[1731]: nss_getpwnam: name '107'
>>>> does not map into domain 'redacted.com'
>>>> May 22 10:19:23 virthost1 rpc.idmapd[1731]: nss_getpwnam: name '0' does
>>>> not map into domain 'redacted

Re: console proxy times out

2014-05-23 Thread Ian Young
Also, is this normal?  Every time the server is rebooted, it adds another
record to the mshost table but the "removed" field is always NULL.

http://pastebin.com/q5zDCu4b


On Fri, May 23, 2014 at 10:39 AM, Ian Young  wrote:

> The SSVM is stopped.  If I try to start it, it complains about
> insufficient capacity.  CPU?  RAM?  I have plenty of both available.
>
> 2014-05-23 10:36:51,196 DEBUG [c.c.d.FirstFitPlanner]
> (Job-Executor-2:ctx-ababac38 ctx-0906d3c3) Listing clusters in order of
> aggregate capacity, that have (atleast one host with) enough CPU and RAM
> capacity under this Pod: 1
> 2014-05-23 10:36:51,198 DEBUG [c.c.d.FirstFitPlanner]
> (Job-Executor-2:ctx-ababac38 ctx-0906d3c3) Removing from the clusterId list
> these clusters from avoid set: [1]
> 2014-05-23 10:36:51,198 DEBUG [c.c.d.FirstFitPlanner]
> (Job-Executor-2:ctx-ababac38 ctx-0906d3c3) No clusters found after removing
> disabled clusters and clusters in avoid list, returning.
> 2014-05-23 10:36:51,201 DEBUG [c.c.c.CapacityManagerImpl]
> (Job-Executor-2:ctx-ababac38 ctx-0906d3c3) VM state transitted from
> :Starting to Stopped with event: OperationFailedvm's original host id: 1
> new host id: null host id before state transition: null
> 2014-05-23 10:36:51,201 WARN  [c.c.s.s.SecondaryStorageManagerImpl]
> (Job-Executor-2:ctx-ababac38 ctx-0906d3c3) Exception while trying to start
> secondary storage vm
> com.cloud.exception.InsufficientServerCapacityException: Unable to create
> a deployment for VM[SecondaryStorageVm|s-1-VM]Scope=interface
> com.cloud.dc.DataCenter; id=1
>
>
> On Fri, May 23, 2014 at 10:35 AM, Ian Young wrote:
>
>> I rebooted it and now it's in an even more broken state.  It's repeatedly
>> trying to stop the console proxy but can't because its state is "Starting."
>>  Here is an excerpt from the management log:
>>
>> http://pastebin.com/FiaDzKXb
>>
>> The agent log keeps repeating these messages:
>>
>> http://pastebin.com/yDidSbrz
>>
>> What's wrong with it?
>>
>>
>> On Thu, May 22, 2014 at 12:55 PM, Ian Young wrote:
>>
>>> I wonder if something is wrong with the NFS mount.  I see this error
>>> periodically in /var/log/messages even though I have set the Domain in
>>> /etc/idmapd.conf to the host's FQDN:
>>>
>>> May 20 19:30:22 virthost1 rpc.idmapd[1790]: nss_getpwnam: name '0' does
>>> not map into domain 'redacted.com'
>>> May 20 19:36:20 virthost1 rpc.idmapd[1731]: nss_getpwnam: name '0' does
>>> not map into domain 'redacted.com'
>>> May 20 19:44:35 virthost1 rpc.idmapd[1731]: nss_getpwnam: name '0' does
>>> not map into domain 'redacted.com'
>>> May 21 10:21:25 virthost1 rpc.idmapd[1731]: nss_getpwnam: name '0' does
>>> not map into domain 'redacted.com'
>>> May 21 12:46:40 virthost1 rpc.idmapd[1731]: nss_getpwnam: name '0' does
>>> not map into domain 'redacted.com'
>>> May 21 13:52:42 virthost1 rpc.idmapd[1731]: nss_getpwnam: name '0' does
>>> not map into domain 'redacted.com'
>>> May 21 13:55:20 virthost1 rpc.idmapd[1731]: nss_getpwnam: name '107'
>>> does not map into domain 'redacted.com'
>>> May 21 20:31:51 virthost1 rpc.idmapd[1731]: nss_getpwnam: name '107'
>>> does not map into domain 'redacted.com'
>>> May 22 10:14:18 virthost1 rpc.idmapd[1731]: nss_getpwnam: name '0' does
>>> not map into domain 'redacted.com'
>>> May 22 10:18:40 virthost1 rpc.idmapd[1731]: nss_getpwnam: name '107'
>>> does not map into domain 'redacted.com'
>>> May 22 10:19:23 virthost1 rpc.idmapd[1731]: nss_getpwnam: name '0' does
>>> not map into domain 'redacted.com'
>>> May 22 10:25:16 virthost1 rpc.idmapd[1731]: nss_getpwnam: name '107'
>>> does not map into domain 'redacted.com'
>>>
>>> name '107' just started appearing in the log yesterday, which looks
>>> unusual.  Up until then, the error was always name '0'.
>>>
>>>
>>> On Thu, May 22, 2014 at 11:15 AM, Andrija Panic wrote:
>>>
>>>> I have observed this kind of problem ("process blocked for more than xx
>>>> sec...") when I had problems accessing storage - check your disks,
>>>> smartctl, etc.
>>>> best
>>>>
>>>> Sent from Google Nexus 4
>>>> On May 22, 2014 7:49 PM, "Ian Young" 

Re: console proxy times out

2014-05-23 Thread Ian Young
The SSVM is stopped.  If I try to start it, it complains about insufficient
capacity.  CPU?  RAM?  I have plenty of both available.

2014-05-23 10:36:51,196 DEBUG [c.c.d.FirstFitPlanner]
(Job-Executor-2:ctx-ababac38 ctx-0906d3c3) Listing clusters in order of
aggregate capacity, that have (atleast one host with) enough CPU and RAM
capacity under this Pod: 1
2014-05-23 10:36:51,198 DEBUG [c.c.d.FirstFitPlanner]
(Job-Executor-2:ctx-ababac38 ctx-0906d3c3) Removing from the clusterId list
these clusters from avoid set: [1]
2014-05-23 10:36:51,198 DEBUG [c.c.d.FirstFitPlanner]
(Job-Executor-2:ctx-ababac38 ctx-0906d3c3) No clusters found after removing
disabled clusters and clusters in avoid list, returning.
2014-05-23 10:36:51,201 DEBUG [c.c.c.CapacityManagerImpl]
(Job-Executor-2:ctx-ababac38 ctx-0906d3c3) VM state transitted from
:Starting to Stopped with event: OperationFailedvm's original host id: 1
new host id: null host id before state transition: null
2014-05-23 10:36:51,201 WARN  [c.c.s.s.SecondaryStorageManagerImpl]
(Job-Executor-2:ctx-ababac38 ctx-0906d3c3) Exception while trying to start
secondary storage vm
com.cloud.exception.InsufficientServerCapacityException: Unable to create a
deployment for VM[SecondaryStorageVm|s-1-VM]Scope=interface
com.cloud.dc.DataCenter; id=1
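
Reading the excerpt again, the planner says cluster 1 is in the avoid set and that no clusters remain after removing it, so the "insufficient capacity" error may not be about CPU or RAM at all. Here is a small sketch for pulling those two facts out of a management log; the function name is made up, and the sample text is just the wrapped lines from this email (a real log would have one entry per line).

```python
import re


def summarize_planner(log_text):
    """Collect avoided cluster ids and VMs that failed to deploy."""
    # Collapse whitespace so entries wrapped by the mail client match again.
    flat = " ".join(log_text.split())
    avoided = set()
    for group in re.findall(r"clusters from avoid set: \[([^\]]*)\]", flat):
        avoided.update(item.strip() for item in group.split(",") if item.strip())
    failed = re.findall(
        r"InsufficientServerCapacityException: Unable to create a deployment "
        r"for VM\[([^\]]+)\]",
        flat,
    )
    return {"avoided_clusters": avoided, "failed_vms": failed}


sample_log = """\
2014-05-23 10:36:51,198 DEBUG [c.c.d.FirstFitPlanner]
(Job-Executor-2:ctx-ababac38 ctx-0906d3c3) Removing from the clusterId list
these clusters from avoid set: [1]
2014-05-23 10:36:51,201 WARN  [c.c.s.s.SecondaryStorageManagerImpl]
(Job-Executor-2:ctx-ababac38 ctx-0906d3c3) Exception while trying to start
secondary storage vm
com.cloud.exception.InsufficientServerCapacityException: Unable to create a
deployment for VM[SecondaryStorageVm|s-1-VM]Scope=interface
com.cloud.dc.DataCenter; id=1
"""

summary = summarize_planner(sample_log)
print(summary)
```

Why cluster 1 ended up in the avoid set is the part I still can't tell from these lines.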


On Fri, May 23, 2014 at 10:35 AM, Ian Young  wrote:

> I rebooted it and now it's in an even more broken state.  It's repeatedly
> trying to stop the console proxy but can't because its state is "Starting."
>  Here is an excerpt from the management log:
>
> http://pastebin.com/FiaDzKXb
>
> The agent log keeps repeating these messages:
>
> http://pastebin.com/yDidSbrz
>
> What's wrong with it?
>
>
> On Thu, May 22, 2014 at 12:55 PM, Ian Young wrote:
>
>> I wonder if something is wrong with the NFS mount.  I see this error
>> periodically in /var/log/messages even though I have set the Domain in
>> /etc/idmapd.conf to the host's FQDN:
>>
>> May 20 19:30:22 virthost1 rpc.idmapd[1790]: nss_getpwnam: name '0' does
>> not map into domain 'redacted.com'
>> May 20 19:36:20 virthost1 rpc.idmapd[1731]: nss_getpwnam: name '0' does
>> not map into domain 'redacted.com'
>> May 20 19:44:35 virthost1 rpc.idmapd[1731]: nss_getpwnam: name '0' does
>> not map into domain 'redacted.com'
>> May 21 10:21:25 virthost1 rpc.idmapd[1731]: nss_getpwnam: name '0' does
>> not map into domain 'redacted.com'
>> May 21 12:46:40 virthost1 rpc.idmapd[1731]: nss_getpwnam: name '0' does
>> not map into domain 'redacted.com'
>> May 21 13:52:42 virthost1 rpc.idmapd[1731]: nss_getpwnam: name '0' does
>> not map into domain 'redacted.com'
>> May 21 13:55:20 virthost1 rpc.idmapd[1731]: nss_getpwnam: name '107' does
>> not map into domain 'redacted.com'
>> May 21 20:31:51 virthost1 rpc.idmapd[1731]: nss_getpwnam: name '107' does
>> not map into domain 'redacted.com'
>> May 22 10:14:18 virthost1 rpc.idmapd[1731]: nss_getpwnam: name '0' does
>> not map into domain 'redacted.com'
>> May 22 10:18:40 virthost1 rpc.idmapd[1731]: nss_getpwnam: name '107' does
>> not map into domain 'redacted.com'
>> May 22 10:19:23 virthost1 rpc.idmapd[1731]: nss_getpwnam: name '0' does
>> not map into domain 'redacted.com'
>> May 22 10:25:16 virthost1 rpc.idmapd[1731]: nss_getpwnam: name '107' does
>> not map into domain 'redacted.com'
>>
>> name '107' just started appearing in the log yesterday, which looks
>> unusual.  Up until then, the error was always name '0'.
>>
>>
>> On Thu, May 22, 2014 at 11:15 AM, Andrija Panic 
>> wrote:
>>
>>> I have observed this kind of problem ("process blocked for more than xx
>>> sec...") when I had problems accessing storage - check your disks,
>>> smartctl, etc.
>>> best
>>>
>>> Sent from Google Nexus 4
>>> On May 22, 2014 7:49 PM, "Ian Young"  wrote:
>>>
>>> > And this is in /var/log/messages right before that event:
>>> >
>>> > May 22 10:16:07 virthost1 kernel: INFO: task qemu-kvm:2971 blocked for
>>> more
>>> > than 120 seconds.
>>> > May 22 10:16:07 virthost1 kernel:  Not tainted
>>> > 2.6.32-431.11.2.el6.x86_64 #1
>>> > May 22 10:16:07 virthost1 kernel: "echo 0 >
>>> > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
>>> > May 22 10:16:07 virthost1 kernel: qemu-kvm  

Re: console proxy times out

2014-05-23 Thread Ian Young
I rebooted it and now it's in an even more broken state.  It's repeatedly
trying to stop the console proxy but can't because its state is "Starting."
Here is an excerpt from the management log:

http://pastebin.com/FiaDzKXb

The agent log keeps repeating these messages:

http://pastebin.com/yDidSbrz

What's wrong with it?


On Thu, May 22, 2014 at 12:55 PM, Ian Young  wrote:

> I wonder if something is wrong with the NFS mount.  I see this error
> periodically in /var/log/messages even though I have set the Domain in
> /etc/idmapd.conf to the host's FQDN:
>
> May 20 19:30:22 virthost1 rpc.idmapd[1790]: nss_getpwnam: name '0' does
> not map into domain 'redacted.com'
> May 20 19:36:20 virthost1 rpc.idmapd[1731]: nss_getpwnam: name '0' does
> not map into domain 'redacted.com'
> May 20 19:44:35 virthost1 rpc.idmapd[1731]: nss_getpwnam: name '0' does
> not map into domain 'redacted.com'
> May 21 10:21:25 virthost1 rpc.idmapd[1731]: nss_getpwnam: name '0' does
> not map into domain 'redacted.com'
> May 21 12:46:40 virthost1 rpc.idmapd[1731]: nss_getpwnam: name '0' does
> not map into domain 'redacted.com'
> May 21 13:52:42 virthost1 rpc.idmapd[1731]: nss_getpwnam: name '0' does
> not map into domain 'redacted.com'
> May 21 13:55:20 virthost1 rpc.idmapd[1731]: nss_getpwnam: name '107' does
> not map into domain 'redacted.com'
> May 21 20:31:51 virthost1 rpc.idmapd[1731]: nss_getpwnam: name '107' does
> not map into domain 'redacted.com'
> May 22 10:14:18 virthost1 rpc.idmapd[1731]: nss_getpwnam: name '0' does
> not map into domain 'redacted.com'
> May 22 10:18:40 virthost1 rpc.idmapd[1731]: nss_getpwnam: name '107' does
> not map into domain 'redacted.com'
> May 22 10:19:23 virthost1 rpc.idmapd[1731]: nss_getpwnam: name '0' does
> not map into domain 'redacted.com'
> May 22 10:25:16 virthost1 rpc.idmapd[1731]: nss_getpwnam: name '107' does
> not map into domain 'redacted.com'
>
> name '107' just started appearing in the log yesterday, which looks
> unusual.  Up until then, the error was always name '0'.
>
>
> On Thu, May 22, 2014 at 11:15 AM, Andrija Panic 
> wrote:
>
>> I have observed this kind of problem ("process blocked for more than xx
>> sec...") when I had problems accessing storage - check your disks,
>> smartctl, etc.
>> best
>>
>> Sent from Google Nexus 4
>> On May 22, 2014 7:49 PM, "Ian Young"  wrote:
>>
>> > And this is in /var/log/messages right before that event:
>> >
>> > May 22 10:16:07 virthost1 kernel: INFO: task qemu-kvm:2971 blocked for
>> more
>> > than 120 seconds.
>> > May 22 10:16:07 virthost1 kernel:  Not tainted
>> > 2.6.32-431.11.2.el6.x86_64 #1
>> > May 22 10:16:07 virthost1 kernel: "echo 0 >
>> > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
>> > May 22 10:16:07 virthost1 kernel: qemu-kvm  D 0002 0
>> >  2971  1 0x0080
>> > May 22 10:16:07 virthost1 kernel: 8810724e9be8 0082
>> >  88106b6529d8
>> > May 22 10:16:07 virthost1 kernel: 880871b3e8d8 880871b3e8f0
>> > 8100bb8e 8810724e9be8
>> > May 22 10:16:07 virthost1 kernel: 881073525058 8810724e9fd8
>> > fbc8 881073525058
>> > May 22 10:16:07 virthost1 kernel: Call Trace:
>> > May 22 10:16:07 virthost1 kernel: [] ?
>> > apic_timer_interrupt+0xe/0x20
>> > May 22 10:16:07 virthost1 kernel: [] ?
>> > mutex_spin_on_owner+0x9f/0xc0
>> > May 22 10:16:07 virthost1 kernel: []
>> > __mutex_lock_slowpath+0x13e/0x180
>> > May 22 10:16:07 virthost1 kernel: []
>> mutex_lock+0x2b/0x50
>> > May 22 10:16:07 virthost1 kernel: []
>> > memory_access_ok+0x7f/0xc0 [vhost_net]
>> > May 22 10:16:07 virthost1 kernel: []
>> > vhost_dev_ioctl+0x2ec/0xa50 [vhost_net]
>> > May 22 10:16:07 virthost1 kernel: [] ?
>> > vhost_work_flush+0xe1/0x120 [vhost_net]
>> > May 22 10:16:07 virthost1 kernel: [] ?
>> > avc_has_perm+0x71/0x90
>> > May 22 10:16:07 virthost1 kernel: []
>> > vhost_net_ioctl+0x7a/0x5d0 [vhost_net]
>> > May 22 10:16:07 virthost1 kernel: [] ?
>> > inode_has_perm+0x54/0xa0
>> > May 22 10:16:07 virthost1 kernel: [] ?
>> > kvm_vcpu_ioctl+0x1e7/0x580 [kvm]
>> > May 22 10:16:07 virthost1 kernel:

Re: console proxy times out

2014-05-22 Thread Ian Young
I wonder if something is wrong with the NFS mount.  I see this error
periodically in /var/log/messages even though I have set the Domain in
/etc/idmapd.conf to the host's FQDN:

May 20 19:30:22 virthost1 rpc.idmapd[1790]: nss_getpwnam: name '0' does not
map into domain 'redacted.com'
May 20 19:36:20 virthost1 rpc.idmapd[1731]: nss_getpwnam: name '0' does not
map into domain 'redacted.com'
May 20 19:44:35 virthost1 rpc.idmapd[1731]: nss_getpwnam: name '0' does not
map into domain 'redacted.com'
May 21 10:21:25 virthost1 rpc.idmapd[1731]: nss_getpwnam: name '0' does not
map into domain 'redacted.com'
May 21 12:46:40 virthost1 rpc.idmapd[1731]: nss_getpwnam: name '0' does not
map into domain 'redacted.com'
May 21 13:52:42 virthost1 rpc.idmapd[1731]: nss_getpwnam: name '0' does not
map into domain 'redacted.com'
May 21 13:55:20 virthost1 rpc.idmapd[1731]: nss_getpwnam: name '107' does
not map into domain 'redacted.com'
May 21 20:31:51 virthost1 rpc.idmapd[1731]: nss_getpwnam: name '107' does
not map into domain 'redacted.com'
May 22 10:14:18 virthost1 rpc.idmapd[1731]: nss_getpwnam: name '0' does not
map into domain 'redacted.com'
May 22 10:18:40 virthost1 rpc.idmapd[1731]: nss_getpwnam: name '107' does
not map into domain 'redacted.com'
May 22 10:19:23 virthost1 rpc.idmapd[1731]: nss_getpwnam: name '0' does not
map into domain 'redacted.com'
May 22 10:25:16 virthost1 rpc.idmapd[1731]: nss_getpwnam: name '107' does
not map into domain 'redacted.com'

name '107' just started appearing in the log yesterday, which looks
unusual.  Up until then, the error was always name '0'.
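
To see when each name shows up, a quick tally like the sketch below would do. The names look like numeric IDs rather than user names: 0 would be root, and 107 may be the qemu account on this host, though that is an assumption I'd confirm with "getent passwd 107". The helper name and the shortened sample are illustrative only.

```python
import re
from collections import Counter

# Matches the rpc.idmapd complaint and captures the name it could not map.
PATTERN = re.compile(
    r"rpc\.idmapd\[\d+\]: nss_getpwnam: name '([^']+)' does not map into domain"
)


def count_unmapped_names(log_text):
    """Count how often each name appears in idmapd 'does not map' messages."""
    # Collapse whitespace so entries wrapped by the mail client still match.
    flat = " ".join(log_text.split())
    return Counter(PATTERN.findall(flat))


sample_messages = """\
May 21 13:52:42 virthost1 rpc.idmapd[1731]: nss_getpwnam: name '0' does not
map into domain 'redacted.com'
May 21 13:55:20 virthost1 rpc.idmapd[1731]: nss_getpwnam: name '107' does
not map into domain 'redacted.com'
May 22 10:14:18 virthost1 rpc.idmapd[1731]: nss_getpwnam: name '0' does not
map into domain 'redacted.com'
"""

counts = count_unmapped_names(sample_messages)
for name, total in sorted(counts.items()):
    print(f"name {name!r}: {total} message(s)")
```

On the host itself the input would be the text of /var/log/messages.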


On Thu, May 22, 2014 at 11:15 AM, Andrija Panic wrote:

> I have observed this kind of problem ("process blocked for more than xx
> sec...") when I had problems accessing storage - check your disks,
> smartctl, etc.
> best
>
> Sent from Google Nexus 4
> On May 22, 2014 7:49 PM, "Ian Young"  wrote:
>
> > And this is in /var/log/messages right before that event:
> >
> > May 22 10:16:07 virthost1 kernel: INFO: task qemu-kvm:2971 blocked for
> more
> > than 120 seconds.
> > May 22 10:16:07 virthost1 kernel:  Not tainted
> > 2.6.32-431.11.2.el6.x86_64 #1
> > May 22 10:16:07 virthost1 kernel: "echo 0 >
> > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> > May 22 10:16:07 virthost1 kernel: qemu-kvm  D 0002 0
> >  2971  1 0x0080
> > May 22 10:16:07 virthost1 kernel: 8810724e9be8 0082
> >  88106b6529d8
> > May 22 10:16:07 virthost1 kernel: 880871b3e8d8 880871b3e8f0
> > 8100bb8e 8810724e9be8
> > May 22 10:16:07 virthost1 kernel: 881073525058 8810724e9fd8
> > fbc8 881073525058
> > May 22 10:16:07 virthost1 kernel: Call Trace:
> > May 22 10:16:07 virthost1 kernel: [] ?
> > apic_timer_interrupt+0xe/0x20
> > May 22 10:16:07 virthost1 kernel: [] ?
> > mutex_spin_on_owner+0x9f/0xc0
> > May 22 10:16:07 virthost1 kernel: []
> > __mutex_lock_slowpath+0x13e/0x180
> > May 22 10:16:07 virthost1 kernel: [] mutex_lock+0x2b/0x50
> > May 22 10:16:07 virthost1 kernel: []
> > memory_access_ok+0x7f/0xc0 [vhost_net]
> > May 22 10:16:07 virthost1 kernel: []
> > vhost_dev_ioctl+0x2ec/0xa50 [vhost_net]
> > May 22 10:16:07 virthost1 kernel: [] ?
> > vhost_work_flush+0xe1/0x120 [vhost_net]
> > May 22 10:16:07 virthost1 kernel: [] ?
> > avc_has_perm+0x71/0x90
> > May 22 10:16:07 virthost1 kernel: []
> > vhost_net_ioctl+0x7a/0x5d0 [vhost_net]
> > May 22 10:16:07 virthost1 kernel: [] ?
> > inode_has_perm+0x54/0xa0
> > May 22 10:16:07 virthost1 kernel: [] ?
> > kvm_vcpu_ioctl+0x1e7/0x580 [kvm]
> > May 22 10:16:07 virthost1 kernel: [] ?
> > send_signal+0x3e/0x90
> > May 22 10:16:07 virthost1 kernel: [] vfs_ioctl+0x22/0xa0
> > May 22 10:16:07 virthost1 kernel: []
> > do_vfs_ioctl+0x84/0x580
> > May 22 10:16:07 virthost1 kernel: [] sys_ioctl+0x81/0xa0
> > May 22 10:16:07 virthost1 kernel: [] ?
> > __audit_syscall_exit+0x25e/0x290
> > May 22 10:16:07 virthost1 kernel: []
> > system_call_fastpath+0x16/0x1b
> >
> >
> > On Thu, May 22, 2014 at 10:39 AM, Ian Young 
> > wrote:
> >
> > > The console proxy became unavailable again yesterday afternoon.  I could
> > > SSH into it via its link local address and nothing seemed to be wrong
> > > inside the VM itself.  However, the qemu-kv

Re: console proxy times out

2014-05-22 Thread Ian Young
And this is in /var/log/messages right before that event:

May 22 10:16:07 virthost1 kernel: INFO: task qemu-kvm:2971 blocked for more
than 120 seconds.
May 22 10:16:07 virthost1 kernel:  Not tainted
2.6.32-431.11.2.el6.x86_64 #1
May 22 10:16:07 virthost1 kernel: "echo 0 >
/proc/sys/kernel/hung_task_timeout_secs" disables this message.
May 22 10:16:07 virthost1 kernel: qemu-kvm  D 0002 0
 2971  1 0x0080
May 22 10:16:07 virthost1 kernel: 8810724e9be8 0082
 88106b6529d8
May 22 10:16:07 virthost1 kernel: 880871b3e8d8 880871b3e8f0
8100bb8e 8810724e9be8
May 22 10:16:07 virthost1 kernel: 881073525058 8810724e9fd8
fbc8 881073525058
May 22 10:16:07 virthost1 kernel: Call Trace:
May 22 10:16:07 virthost1 kernel: [] ?
apic_timer_interrupt+0xe/0x20
May 22 10:16:07 virthost1 kernel: [] ?
mutex_spin_on_owner+0x9f/0xc0
May 22 10:16:07 virthost1 kernel: []
__mutex_lock_slowpath+0x13e/0x180
May 22 10:16:07 virthost1 kernel: [] mutex_lock+0x2b/0x50
May 22 10:16:07 virthost1 kernel: []
memory_access_ok+0x7f/0xc0 [vhost_net]
May 22 10:16:07 virthost1 kernel: []
vhost_dev_ioctl+0x2ec/0xa50 [vhost_net]
May 22 10:16:07 virthost1 kernel: [] ?
vhost_work_flush+0xe1/0x120 [vhost_net]
May 22 10:16:07 virthost1 kernel: [] ?
avc_has_perm+0x71/0x90
May 22 10:16:07 virthost1 kernel: []
vhost_net_ioctl+0x7a/0x5d0 [vhost_net]
May 22 10:16:07 virthost1 kernel: [] ?
inode_has_perm+0x54/0xa0
May 22 10:16:07 virthost1 kernel: [] ?
kvm_vcpu_ioctl+0x1e7/0x580 [kvm]
May 22 10:16:07 virthost1 kernel: [] ?
send_signal+0x3e/0x90
May 22 10:16:07 virthost1 kernel: [] vfs_ioctl+0x22/0xa0
May 22 10:16:07 virthost1 kernel: []
do_vfs_ioctl+0x84/0x580
May 22 10:16:07 virthost1 kernel: [] sys_ioctl+0x81/0xa0
May 22 10:16:07 virthost1 kernel: [] ?
__audit_syscall_exit+0x25e/0x290
May 22 10:16:07 virthost1 kernel: []
system_call_fastpath+0x16/0x1b


On Thu, May 22, 2014 at 10:39 AM, Ian Young  wrote:

> The console proxy became unavailable again yesterday afternoon.  I could
> SSH into it via its link local address and nothing seemed to be wrong
> inside the VM itself.  However, the qemu-kvm process for that VM was at
> almost 100% CPU.  Inside the VM, the CPU usage was minimal and the java
> process was running and listening on port 443.  So there seems to be
> something wrong with it down at the KVM/QEMU level.  It's weird how this
> keeps happening to the console proxy only and not any of the other VMs.  I
> tried to reboot it from the management UI and after about 15 minutes, it
> finally did.  Now the console proxy is working but I don't know how long it
> will last before it breaks again.  I found this in libvirtd.log, which
> corresponds with the time the console proxy rebooted:
>
> 2014-05-22 17:17:04.362+0000: 25195: info : libvirt version: 0.10.2,
> package: 29.el6_5.7 (CentOS BuildSystem <http://bugs.centos.org>,
> 2014-04-07-07:42:04, c6b9.bsys.dev.centos.org)
> 2014-05-22 17:17:04.362+0000: 25195: error : qemuMonitorIO:614 : internal
> error End of file from monitor
>
>
> On Wed, May 21, 2014 at 2:07 PM, Ian Young  wrote:
>
>> I built and installed a libvirt 1.0.4 package from the Fedora src rpm.  It
>> installed fine inside a test VM but installing it on the real hypervisor
>> was a bad idea and I doubt I'll be pursuing it further.  All VMs promptly
>> stopped and this appeared in libvirtd.log:
>>
>> 2014-05-21 20:36:19.260+: 23567: info : libvirt version: 1.0.4,
>> package: 1.el6 (Unknown, 2014-05-21-11:36:09, redacted.com)
>> 2014-05-21 20:36:19.260+: 23567: warning : virDriverLoadModule:72 :
>> Module /usr/lib64/libvirt/connection-driver/libvirt_driver_network.so not
>> accessible
>> 2014-05-21 20:36:19.260+: 23567: warning : virDriverLoadModule:72 :
>> Module /usr/lib64/libvirt/connection-driver/libvirt_driver_storage.so not
>> accessible
>> 2014-05-21 20:36:19.260+: 23567: warning : virDriverLoadModule:72 :
>> Module /usr/lib64/libvirt/connection-driver/libvirt_driver_nodedev.so not
>> accessible
>> 2014-05-21 20:36:19.260+: 23567: warning : virDriverLoadModule:72 :
>> Module /usr/lib64/libvirt/connection-driver/libvirt_driver_secret.so not
>> accessible
>> 2014-05-21 20:36:19.260+: 23567: warning : virDriverLoadModule:72 :
>> Module /usr/lib64/libvirt/connection-driver/libvirt_driver_nwfilter.so not
>> accessible
>> 2014-05-21 20:36:19.260+: 23567: warning : virDriverLoadModule:72 :
>> Module /usr/lib64/libvirt/connection-driver/libvirt_driver_interface.so not
>> accessible
>> 2014-05-21 20:36:19.260+: 23567: warning : virDriverLoadModule:72 :
>> Module /usr/lib64/libvirt/connection-driver/libvirt_driver_qemu.so not
>> accessible

Re: console proxy times out

2014-05-22 Thread Ian Young
The console proxy became unavailable again yesterday afternoon.  I could
SSH into it via its link local address and nothing seemed to be wrong
inside the VM itself.  However, the qemu-kvm process for that VM was at
almost 100% CPU.  Inside the VM, the CPU usage was minimal and the java
process was running and listening on port 443.  So there seems to be
something wrong with it down at the KVM/QEMU level.  It's weird how this
keeps happening to the console proxy only and not any of the other VMs.  I
tried to reboot it from the management UI and after about 15 minutes, it
finally did.  Now the console proxy is working but I don't know how long it
will last before it breaks again.  I found this in libvirtd.log, which
corresponds with the time the console proxy rebooted:

2014-05-22 17:17:04.362+0000: 25195: info : libvirt version: 0.10.2,
package: 29.el6_5.7 (CentOS BuildSystem <http://bugs.centos.org>,
2014-04-07-07:42:04, c6b9.bsys.dev.centos.org)
2014-05-22 17:17:04.362+0000: 25195: error : qemuMonitorIO:614 : internal
error End of file from monitor


On Wed, May 21, 2014 at 2:07 PM, Ian Young  wrote:

> I built and installed a libvirt 1.0.4 package from the Fedora src rpm.  It
> installed fine inside a test VM but installing it on the real hypervisor
> was a bad idea and I doubt I'll be pursuing it further.  All VMs promptly
> stopped and this appeared in libvirtd.log:
>
> 2014-05-21 20:36:19.260+: 23567: info : libvirt version: 1.0.4,
> package: 1.el6 (Unknown, 2014-05-21-11:36:09, redacted.com)
> 2014-05-21 20:36:19.260+: 23567: warning : virDriverLoadModule:72 :
> Module /usr/lib64/libvirt/connection-driver/libvirt_driver_network.so not
> accessible
> 2014-05-21 20:36:19.260+: 23567: warning : virDriverLoadModule:72 :
> Module /usr/lib64/libvirt/connection-driver/libvirt_driver_storage.so not
> accessible
> 2014-05-21 20:36:19.260+: 23567: warning : virDriverLoadModule:72 :
> Module /usr/lib64/libvirt/connection-driver/libvirt_driver_nodedev.so not
> accessible
> 2014-05-21 20:36:19.260+: 23567: warning : virDriverLoadModule:72 :
> Module /usr/lib64/libvirt/connection-driver/libvirt_driver_secret.so not
> accessible
> 2014-05-21 20:36:19.260+: 23567: warning : virDriverLoadModule:72 :
> Module /usr/lib64/libvirt/connection-driver/libvirt_driver_nwfilter.so not
> accessible
> 2014-05-21 20:36:19.260+: 23567: warning : virDriverLoadModule:72 :
> Module /usr/lib64/libvirt/connection-driver/libvirt_driver_interface.so not
> accessible
> 2014-05-21 20:36:19.260+: 23567: warning : virDriverLoadModule:72 :
> Module /usr/lib64/libvirt/connection-driver/libvirt_driver_qemu.so not
> accessible
> 2014-05-21 20:36:19.260+: 23567: warning : virDriverLoadModule:72 :
> Module /usr/lib64/libvirt/connection-driver/libvirt_driver_lxc.so not
> accessible
> 2014-05-21 20:36:49.471+: 23570: error : do_open:1220 : no connection
> driver available for qemu:///system
> 2014-05-21 20:36:49.472+: 23567: error : virNetSocketReadWire:1370 :
> End of file while reading data: Input/output error
> 2014-05-21 20:36:49.473+: 23571: error : do_open:1220 : no connection
> driver available for lxc:///
> 2014-05-21 20:36:49.474+: 23567: error : virNetSocketReadWire:1370 :
> End of file while reading data: Input/output error
> 2014-05-21 20:36:49.475+: 23568: error : do_open:1220 : no connection
> driver available for qemu:///system
> 2014-05-21 20:36:49.476+: 23567: error : virNetSocketReadWire:1370 :
> End of file while reading data: Input/output error
> 2014-05-21 20:36:49.678+: 23575: error : do_open:1220 : no connection
> driver available for qemu:///system
> 2014-05-21 20:36:49.678+: 23567: error : virNetSocketReadWire:1370 :
> End of file while reading data: Input/output error
> 2014-05-21 20:36:49.681+: 23572: error : do_open:1220 : no connection
> driver available for qemu:///system
> 2014-05-21 20:36:49.682+: 23567: error : virNetSocketReadWire:1370 :
> End of file while reading data: Input/output error
>
>
> On Wed, May 21, 2014 at 10:45 AM, Ian Young wrote:
>
>> I was able to get it working by following these steps:
>>
>> 1. stop all instances
>> 2. service cloudstack-management stop
>> 3. service cloudstack-agent stop
>> 4. virsh shutdown {domain} (for each of the system VMs)
>> 5. service libvirtd stop
>> 6. umount primary and secondary
>> 7. reboot
>>
>> The console proxy is working again.  I expect it will probably break
>> again in a day or two.  I have a feeling it's a result of this libvirtd
>> bug, since I've seen the "cannot acquire state change lock" several times.
>>
>> https://bugs.launchpad.net/nova/+bug/1254872
>>
>> I might try buildin

Re: console proxy times out

2014-05-21 Thread Ian Young
I built and installed a libvirt 1.0.4 package from the Fedora src rpm.  It
installed fine inside a test VM but installing it on the real hypervisor
was a bad idea and I doubt I'll be pursuing it further.  All VMs promptly
stopped and this appeared in libvirtd.log:

2014-05-21 20:36:19.260+0000: 23567: info : libvirt version: 1.0.4,
package: 1.el6 (Unknown, 2014-05-21-11:36:09, redacted.com)
2014-05-21 20:36:19.260+0000: 23567: warning : virDriverLoadModule:72 :
Module /usr/lib64/libvirt/connection-driver/libvirt_driver_network.so not
accessible
2014-05-21 20:36:19.260+0000: 23567: warning : virDriverLoadModule:72 :
Module /usr/lib64/libvirt/connection-driver/libvirt_driver_storage.so not
accessible
2014-05-21 20:36:19.260+0000: 23567: warning : virDriverLoadModule:72 :
Module /usr/lib64/libvirt/connection-driver/libvirt_driver_nodedev.so not
accessible
2014-05-21 20:36:19.260+0000: 23567: warning : virDriverLoadModule:72 :
Module /usr/lib64/libvirt/connection-driver/libvirt_driver_secret.so not
accessible
2014-05-21 20:36:19.260+0000: 23567: warning : virDriverLoadModule:72 :
Module /usr/lib64/libvirt/connection-driver/libvirt_driver_nwfilter.so not
accessible
2014-05-21 20:36:19.260+0000: 23567: warning : virDriverLoadModule:72 :
Module /usr/lib64/libvirt/connection-driver/libvirt_driver_interface.so not
accessible
2014-05-21 20:36:19.260+0000: 23567: warning : virDriverLoadModule:72 :
Module /usr/lib64/libvirt/connection-driver/libvirt_driver_qemu.so not
accessible
2014-05-21 20:36:19.260+0000: 23567: warning : virDriverLoadModule:72 :
Module /usr/lib64/libvirt/connection-driver/libvirt_driver_lxc.so not
accessible
2014-05-21 20:36:49.471+0000: 23570: error : do_open:1220 : no connection
driver available for qemu:///system
2014-05-21 20:36:49.472+0000: 23567: error : virNetSocketReadWire:1370 :
End of file while reading data: Input/output error
2014-05-21 20:36:49.473+0000: 23571: error : do_open:1220 : no connection
driver available for lxc:///
2014-05-21 20:36:49.474+0000: 23567: error : virNetSocketReadWire:1370 :
End of file while reading data: Input/output error
2014-05-21 20:36:49.475+0000: 23568: error : do_open:1220 : no connection
driver available for qemu:///system
2014-05-21 20:36:49.476+0000: 23567: error : virNetSocketReadWire:1370 :
End of file while reading data: Input/output error
2014-05-21 20:36:49.678+0000: 23575: error : do_open:1220 : no connection
driver available for qemu:///system
2014-05-21 20:36:49.678+0000: 23567: error : virNetSocketReadWire:1370 :
End of file while reading data: Input/output error
2014-05-21 20:36:49.681+0000: 23572: error : do_open:1220 : no connection
driver available for qemu:///system
2014-05-21 20:36:49.682+0000: 23567: error : virNetSocketReadWire:1370 :
End of file while reading data: Input/output error


On Wed, May 21, 2014 at 10:45 AM, Ian Young  wrote:

> I was able to get it working by following these steps:
>
> 1. stop all instances
> 2. service cloudstack-management stop
> 3. service cloudstack-agent stop
> 4. virsh shutdown {domain} (for each of the system VMs)
> 5. service libvirtd stop
> 6. umount primary and secondary
> 7. reboot
>
> The console proxy is working again.  I expect it will probably break again
> in a day or two.  I have a feeling it's a result of this libvirtd bug,
> since I've seen the "cannot acquire state change lock" several times.
>
> https://bugs.launchpad.net/nova/+bug/1254872
>
> I might try building my own libvirtd 1.0.3 for EL6.
>
>
> On Tue, May 20, 2014 at 6:21 PM, Ian Young  wrote:
>
>> So I got the console proxy working via HTTPS (by managing my own "
>> realhostip.com" DNS) last week and everything was working fine.  Today,
>> all of a sudden, the console proxy stopped working again.  The browser
>> says, "Connecting to 192-168-100-159.realhostip.com..." and eventually
>> times out.  I tried to restart it and it went into a "Stopping" state that
>> never completed and the Agent State was "Disconnected."  I could not shut
>> down the VM using virsh or with "kill -9" because libvirtd kept saying,
>> "cannot acquire state change lock," so I gracefully shut down the remaining
>> instances and rebooted the entire management server/hypervisor.  Start over.
>>
>> When it came back up, the SSVM and console proxy started but the virtual
>> router was stopped.  I was able to manually start it from the UI.  The
>> console proxy still times out when I try to access it from a browser.  I
>> don't see any errors in the management or agent logs, just this:
>>
>> 2014-05-20 18:04:27,632 DEBUG [c.c.a.t.Request] (catalina-exec-10:null)
>> Seq 1-2130378876: Sending  { Cmd , MgmtId: 55157049428734, via: 1(
>> virthost1.redacted.com), Ver: v1, Flags: 100011,

Re: console proxy times out

2014-05-21 Thread Ian Young
I was able to get it working by following these steps:

1. stop all instances
2. service cloudstack-management stop
3. service cloudstack-agent stop
4. virsh shutdown {domain} (for each of the system VMs)
5. service libvirtd stop
6. umount primary and secondary
7. reboot

The console proxy is working again.  I expect it will probably break again
in a day or two.  I have a feeling it's a result of this libvirtd bug,
since I've seen the "cannot acquire state change lock" several times.

https://bugs.launchpad.net/nova/+bug/1254872

I might try building my own libvirtd 1.0.3 for EL6.
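
The shutdown order in the steps above can be sketched as a small script.  This is a dry run by default and only an illustration: the domain names in SYSTEM_VMS follow the usual s-/v-/r- naming (v-2-VM and r-4-VM appear in the logs, s-1-VM is a guess), and the two mount points are placeholders rather than the real paths on this host.

```shell
#!/bin/sh
# Dry-run sketch of the shutdown order described above.
# Assumptions (hypothetical, adjust before use): the system VM domain names
# and the primary/secondary mount points below.
DRY_RUN="${DRY_RUN:-1}"
SYSTEM_VMS="${SYSTEM_VMS:-s-1-VM v-2-VM r-4-VM}"
PRIMARY_MOUNT="${PRIMARY_MOUNT:-/mnt/primary}"
SECONDARY_MOUNT="${SECONDARY_MOUNT:-/mnt/secondary}"

run() {
  # Always print the command; execute it only when DRY_RUN=0.
  echo "+ $*"
  if [ "$DRY_RUN" = "0" ]; then "$@"; fi
}

shutdown_host() {
  # Step 1 (stopping all guest instances) is left to the CloudStack UI.
  run service cloudstack-management stop          # step 2
  run service cloudstack-agent stop               # step 3
  for domain in $SYSTEM_VMS; do                   # step 4
    run virsh shutdown "$domain"
  done
  run service libvirtd stop                       # step 5
  run umount "$PRIMARY_MOUNT"                     # step 6
  run umount "$SECONDARY_MOUNT"
  run reboot                                      # step 7
}
```

Calling shutdown_host with the default DRY_RUN=1 only prints the commands.  Note that "virsh shutdown" returns before the guest has actually powered off, so a real run would probably want to wait for "virsh list" to show the system VMs gone before stopping libvirtd.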


On Tue, May 20, 2014 at 6:21 PM, Ian Young  wrote:

> So I got the console proxy working via HTTPS (by managing my own "
> realhostip.com" DNS) last week and everything was working fine.  Today,
> all of a sudden, the console proxy stopped working again.  The browser
> says, "Connecting to 192-168-100-159.realhostip.com..." and eventually
> times out.  I tried to restart it and it went into a "Stopping" state that
> never completed and the Agent State was "Disconnected."  I could not shut
> down the VM using virsh or with "kill -9" because libvirtd kept saying,
> "cannot acquire state change lock," so I gracefully shut down the remaining
> instances and rebooted the entire management server/hypervisor.  Start over.
>
> When it came back up, the SSVM and console proxy started but the virtual
> router was stopped.  I was able to manually start it from the UI.  The
> console proxy still times out when I try to access it from a browser.  I
> don't see any errors in the management or agent logs, just this:
>
> 2014-05-20 18:04:27,632 DEBUG [c.c.a.t.Request] (catalina-exec-10:null)
> Seq 1-2130378876: Sending  { Cmd , MgmtId: 55157049428734, via: 1(
> virthost1.redacted.com), Ver: v1, Flags: 100011,
> [{"com.cloud.agent.api.GetVncPortCommand":{"id":4,"name":"r-4-VM","wait":0}}]
> }
> 2014-05-20 18:04:27,684 DEBUG [c.c.a.t.Request]
> (AgentManager-Handler-3:null) Seq 1-2130378876: Processing:  { Ans: ,
> MgmtId: 55157049428734, via: 1, Ver: v1, Flags: 10,
> [{"com.cloud.agent.api.GetVncPortAnswer":{"address":"192.168.100.6","port":5902,"result":true,"wait":0}}]
> }
> 2014-05-20 18:04:27,684 DEBUG [c.c.a.t.Request] (catalina-exec-10:null)
> Seq 1-2130378876: Received:  { Ans: , MgmtId: 55157049428734, via: 1, Ver:
> v1, Flags: 10, { GetVncPortAnswer } }
> 2014-05-20 18:04:27,684 DEBUG [c.c.s.ConsoleProxyServlet]
> (catalina-exec-10:null) Port info 192.168.100.6
> 2014-05-20 18:04:27,684 INFO  [c.c.s.ConsoleProxyServlet]
> (catalina-exec-10:null) Parse host info returned from executing
> GetVNCPortCommand. host info: 192.168.100.6
> 2014-05-20 18:04:27,686 DEBUG [c.c.s.ConsoleProxyServlet]
> (catalina-exec-10:null) Compose console url:
> https://192-168-100-159.realhostip.com/ajax?token=CsPhU4m_R2ZoLIdXOtjo3y3humnQN20wt5fSPjbZOHtRh7nli7tiq0ZiWUuwCVIn7VwF6503ziEqCRejlRsVcsyQcUfemTRXlhAOpJUyRugyCuTjmbUIX3EY1cHnFMKwF8FXXZr_PgwyXGPEoOHhkdRgsyRiczbk_Unuh4KmRngATr0FPCLtqhwIMpnbLSYwpnFDz65k9lEJmK6IlXYKVpWXg2rpVEsQvaNlulrZdhMQ7qUbacn82EG43OY8nmwm1SYB8TrUFH5Btb1RHpJm9A
> 2014-05-20 18:04:27,686 DEBUG [c.c.s.ConsoleProxyServlet]
> (catalina-exec-10:null) the console url is ::
> r-4-VMhttps://192-168-100-159.realhostip.com/ajax?token=CsPhU4m_R2ZoLIdXOtjo3y3humnQN20wt5fSPjbZOHtRh7nli7tiq0ZiWUuwCVIn7VwF6503ziEqCRejlRsVcsyQcUfemTRXlhAOpJUyRugyCuTjmbUIX3EY1cHnFMKwF8FXXZr_PgwyXGPEoOHhkdRgsyRiczbk_Unuh4KmRngATr0FPCLtqhwIMpnbLSYwpnFDz65k9lEJmK6IlXYKVpWXg2rpVEsQvaNlulrZdhMQ7qUbacn82EG43OY8nmwm1SYB8TrUFH5Btb1RHpJm9A
> ">
> 2014-05-20 18:04:29,216 DEBUG [c.c.a.m.AgentManagerImpl]
> (AgentManager-Handler-4:null) SeqA 2-545: Processing Seq 2-545:  { Cmd ,
> MgmtId: -1, via: 2, Ver: v1, Flags: 11,
> [{"com.cloud.agent.api.ConsoleProxyLoadReportCommand":{"_proxyVmId":2,"_loadInfo":"{\n
>  \"connections\": []\n}","wait":0}}] }
>
> If I try to restart the system VMs with cloudstack-sysvmadm, it says:
>
> Stopping and starting 1 secondary storage vm(s)...
> curl: (7) couldn't connect to host
> ERROR: Failed to stop secondary storage vm with id 1
>
> Done stopping and starting secondary storage vm(s)
>
> Stopping and starting 1 console proxy vm(s)...
> curl: (7) couldn't connect to host
> ERROR: Failed to stop console proxy vm with id 2
>
> Done stopping and starting console proxy vm(s) .
>
> Stopping and starting 1 running routing vm(s)...
> curl: (7) couldn't connect to host
> 2
> Done restarting router(s).
>
> I notice there are now four entries for the same management server in the
> mshost table, and they all are in an "Up" state and the "removed" field is
> NULL.  What's wrong with this system?
>


console proxy times out

2014-05-20 Thread Ian Young
So I got the console proxy working via HTTPS (by managing my own
"realhostip.com" DNS) last week and everything was working fine.  Today, all
of a sudden, the console proxy stopped working again.  The browser says,
"Connecting to 192-168-100-159.realhostip.com..." and eventually times out.
I tried to restart it and it went into a "Stopping" state that never
completed and the Agent State was "Disconnected."  I could not shut down
the VM using virsh or with "kill -9" because libvirtd kept saying, "cannot
acquire state change lock," so I gracefully shut down the remaining
instances and rebooted the entire management server/hypervisor to start over.

When it came back up, the SSVM and console proxy started but the virtual
router was stopped.  I was able to manually start it from the UI.  The
console proxy still times out when I try to access it from a browser.  I
don't see any errors in the management or agent logs, just this:

2014-05-20 18:04:27,632 DEBUG [c.c.a.t.Request] (catalina-exec-10:null) Seq
1-2130378876: Sending  { Cmd , MgmtId: 55157049428734, via: 1(
virthost1.redacted.com), Ver: v1, Flags: 100011,
[{"com.cloud.agent.api.GetVncPortCommand":{"id":4,"name":"r-4-VM","wait":0}}]
}
2014-05-20 18:04:27,684 DEBUG [c.c.a.t.Request]
(AgentManager-Handler-3:null) Seq 1-2130378876: Processing:  { Ans: ,
MgmtId: 55157049428734, via: 1, Ver: v1, Flags: 10,
[{"com.cloud.agent.api.GetVncPortAnswer":{"address":"192.168.100.6","port":5902,"result":true,"wait":0}}]
}
2014-05-20 18:04:27,684 DEBUG [c.c.a.t.Request] (catalina-exec-10:null) Seq
1-2130378876: Received:  { Ans: , MgmtId: 55157049428734, via: 1, Ver: v1,
Flags: 10, { GetVncPortAnswer } }
2014-05-20 18:04:27,684 DEBUG [c.c.s.ConsoleProxyServlet]
(catalina-exec-10:null) Port info 192.168.100.6
2014-05-20 18:04:27,684 INFO  [c.c.s.ConsoleProxyServlet]
(catalina-exec-10:null) Parse host info returned from executing
GetVNCPortCommand. host info: 192.168.100.6
2014-05-20 18:04:27,686 DEBUG [c.c.s.ConsoleProxyServlet]
(catalina-exec-10:null) Compose console url:
https://192-168-100-159.realhostip.com/ajax?token=CsPhU4m_R2ZoLIdXOtjo3y3humnQN20wt5fSPjbZOHtRh7nli7tiq0ZiWUuwCVIn7VwF6503ziEqCRejlRsVcsyQcUfemTRXlhAOpJUyRugyCuTjmbUIX3EY1cHnFMKwF8FXXZr_PgwyXGPEoOHhkdRgsyRiczbk_Unuh4KmRngATr0FPCLtqhwIMpnbLSYwpnFDz65k9lEJmK6IlXYKVpWXg2rpVEsQvaNlulrZdhMQ7qUbacn82EG43OY8nmwm1SYB8TrUFH5Btb1RHpJm9A
2014-05-20 18:04:27,686 DEBUG [c.c.s.ConsoleProxyServlet]
(catalina-exec-10:null) the console url is ::
r-4-VMhttps://192-168-100-159.realhostip.com/ajax?token=CsPhU4m_R2ZoLIdXOtjo3y3humnQN20wt5fSPjbZOHtRh7nli7tiq0ZiWUuwCVIn7VwF6503ziEqCRejlRsVcsyQcUfemTRXlhAOpJUyRugyCuTjmbUIX3EY1cHnFMKwF8FXXZr_PgwyXGPEoOHhkdRgsyRiczbk_Unuh4KmRngATr0FPCLtqhwIMpnbLSYwpnFDz65k9lEJmK6IlXYKVpWXg2rpVEsQvaNlulrZdhMQ7qUbacn82EG43OY8nmwm1SYB8TrUFH5Btb1RHpJm9A
">
2014-05-20 18:04:29,216 DEBUG [c.c.a.m.AgentManagerImpl]
(AgentManager-Handler-4:null) SeqA 2-545: Processing Seq 2-545:  { Cmd ,
MgmtId: -1, via: 2, Ver: v1, Flags: 11,
[{"com.cloud.agent.api.ConsoleProxyLoadReportCommand":{"_proxyVmId":2,"_loadInfo":"{\n
 \"connections\": []\n}","wait":0}}] }

If I try to restart the system VMs with cloudstack-sysvmadm, it says:

Stopping and starting 1 secondary storage vm(s)...
curl: (7) couldn't connect to host
ERROR: Failed to stop secondary storage vm with id 1

Done stopping and starting secondary storage vm(s)

Stopping and starting 1 console proxy vm(s)...
curl: (7) couldn't connect to host
ERROR: Failed to stop console proxy vm with id 2

Done stopping and starting console proxy vm(s) .

Stopping and starting 1 running routing vm(s)...
curl: (7) couldn't connect to host
2
Done restarting router(s).

I notice there are now four entries for the same management server in the
mshost table, and they all are in an "Up" state and the "removed" field is
NULL.  What's wrong with this system?


Re: can't ping guest network

2014-05-19 Thread Ian Young
I forgot to add ingress rules to the security group.  It works now.


On Mon, May 19, 2014 at 5:18 PM, Ian Young  wrote:

> My VMs can reach the rest of our internal network and even the internet
> but nothing except the management/hypervisor can reach the VMs.  I
> monitored eth0 on one of the VMs while I tried to SSH to it from another
> workstation and it displayed this:
>
> 17:05:29.031584 ARP, Request who-has monitor.cs1cloud.internal tell
> 192.168.100.166, length 46
>
> I have the network bridge set up correctly and I've tried disabling
> iptables and SELinux just to rule those out.  There must be something
> simple I overlooked.  Why does outbound traffic work but inbound traffic
> doesn't?
>


can't ping guest network

2014-05-19 Thread Ian Young
My VMs can reach the rest of our internal network and even the internet but
nothing except the management/hypervisor can reach the VMs.  I monitored
eth0 on one of the VMs while I tried to SSH to it from another workstation
and it displayed this:

17:05:29.031584 ARP, Request who-has monitor.cs1cloud.internal tell
192.168.100.166, length 46

I have the network bridge set up correctly and I've tried disabling
iptables and SELinux just to rule those out.  There must be something
simple I overlooked.  Why does outbound traffic work but inbound traffic
doesn't?


Re: cloudstack 4.3 installation on CentOS

2014-05-16 Thread Ian Young
That quick start guide used to have the wrong URL for the system VM
template but it looks like it has been corrected since then.  Check your
command line history and see if the template you downloaded was the same as
the one in this section:

http://cloudstack-installation.readthedocs.org/en/latest/qig.html#system-template-setup


On Wed, May 7, 2014 at 10:26 AM, dimas yoga pratama wrote:

> yes I follow the wizard.
> This is my configuration with basic installation.
>
> management server IP : 10.151.32.51
>
> Zone Configuration:
>
> Name : Zone1
> Public DNS1: 202.46.129.2 (College DNS)
> Public DNS2: -
> Internal DNS1: 10.151.32.6 (My Lab DNS)
> Internal DNS2: -
>
> Pod Configuration:
>
> Name : Pod1
> Gateway : 10.151.32.1
> Netmask : 255.255.255.0
> Start/end reserved system IPs : 10.151.32.60 - 10.151.32.80
> Guest Gateway : 10.151.32.1
> Guest Netmask : 255.255.255.0
> Guest start/end IP : 10.151.32.90 - 10.151.32.200
>
>
> Cluster Configuration:
>
> Name: Cluster1
> Hypervisor:KVM
>
>
> Host Configuration:
>
> Hostname:10.151.32.51 (Because I build cloudstack with single hardware)
> Username:root
> Password:password
>
> Primary storage:
>
> Name:  Primary1
> Server: 10.151.32.51
> Path :/primary
>
> Secondary storage:
>
> NFS server: 10.151.32.51
> Path :/secondary
>
>
> Is there anything wrong with my configurations?
> I wanna test cloudstack in my Lab environment.
> Thanks.
>
>
> On Wed, May 7, 2014 at 6:31 AM, Pierre-Luc Dion 
> wrote:
>
> > Hi,
> > Did you follow the zone creation wizard from the ui?
> >
> > Could it be that there are not enough management IPs in the pod?
> >
> > Le mardi 6 mai 2014, dimas yoga pratama  a écrit :
> >
> > > Hi all,
> > >
> > > I'm trying to install CloudStack 4.3 on CentOS 6.5 on a single machine
> > > in a proxy environment. I followed this guide:
> > > http://cloudstack-installation.readthedocs.org/en/latest/qig.html. I
> > > managed to log in to the CloudStack dashboard and succeeded in adding a
> > > host. At the "creating system VMs (this may take a while)" step, it took
> > > a very long time, and when I refreshed the browser it suddenly
> > > redirected to the dashboard. I checked the Infrastructure tab and found
> > > 2 system VMs already created; the VM state showed "Starting" but the
> > > agent showed nothing. In the dashboard tab I also found this
> > > notification: "Management Server: Management network CIDR is not
> > > configured original.type 14"
> > > Why is that happening? Is anything wrong with my installation?
> > >
> > > looking forward for your answer.
> > >
> >
> >
> > --
> >
> > Pierre-Luc Dion
> > Architecte de Solution Cloud | Cloud Solutions Architect
> > 855-OK-CLOUD (855-652-5683) x1101
> > - - -
> >
> > *CloudOps*420 rue Guy
> > Montréal QC  H3J 1S6
> > www.cloudops.com
> > @CloudOps_
> >
>


Re: Quick Installation Guide for CentOS 6.5

2014-05-16 Thread Ian Young
Two things to check:

That quick start guide used to have the wrong URL for the system VM
template but it looks like it has been corrected since then.  Check your
command line history and see if the template you downloaded was the same as
the one in this section:

http://cloudstack-installation.readthedocs.org/en/latest/qig.html#system-template-setup

For NFSv4 you need to export a pseudo file system designated by fsid=0,
which this guide doesn't mention.  See section "18.7.1.1. Using exportfs
with NFSv4" in the following article for more information:

http://www.centos.org/docs/5/html/Deployment_Guide-en-US/s1-nfs-server-config-exports.html
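
As a rough illustration of what that could look like, the helper below prints candidate /etc/exports lines with the pseudo root marked fsid=0.  Everything here is an assumption for the sake of the example: the /export root, the primary and secondary subdirectories, the wildcard client, and the function name nfs4_exports are not from the guide.

```shell
# Print candidate /etc/exports lines for review; nothing is written to disk.
# nfs4_exports is a hypothetical helper: first argument is the pseudo root,
# remaining arguments are subdirectories exported beneath it.
nfs4_exports() {
  root="$1"; shift
  # The pseudo file system root is the one export that carries fsid=0.
  printf '%s *(rw,async,no_root_squash,fsid=0)\n' "$root"
  for sub in "$@"; do
    printf '%s/%s *(rw,async,no_root_squash)\n' "$root" "$sub"
  done
}

nfs4_exports /export primary secondary
```

With a layout like this, NFSv4 clients would mount paths relative to the pseudo root (for example server:/primary rather than server:/export/primary).  If the subdirectories live on separate file systems, additional export options may be needed, so treat the output as a starting point only.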



On Tue, May 6, 2014 at 5:33 AM, dimas  wrote:

> Samuel Winchenbach  writes:
>
> >
> > Hi all,
> >
> > I am following the quick installation guide exactly (I even set up a
> > gateway w/ IP 172.16.10.1) but I cannot get the setup to complete.  The
> > System VMs remain stuck on "Starting".
> > Any help would be greatly appreciated!
> >
> > Now I logged into the management console and used the exact settings as
> > listed in the "Quick Install Guide".   It seems to hang forever (>2
> > hours) on "Creating system VMs (this may take a while)"
>
>
> Hi, I encountered the same problem as you. Did you manage to find a
> solution? Looking forward to your answer.
>
>
>
>


Re: replacement for realhostip

2014-05-16 Thread Ian Young
I just realized I had to set the consoleproxy.url.domain field to
"realhostip.com" but now when I try to view the console, the browser says
"The server refused the connection."  Does that indicate a problem with the
SSL certificate?
management-server.log:
2014-05-15 14:43:55,506 DEBUG [c.c.a.t.Request] (catalina-exec-15:null) Seq
1-90898443: Sending  { Cmd , MgmtId: 161342909744, via: 1(
virthost1.lax.ratespecial.com), Ver: v1, Flags: 100011,
[{"com.cloud.agent.api.GetVncPortCommand":{"id":2,"name":"v-2-VM","wait":0}}]
}
2014-05-15 14:43:55,563 DEBUG [c.c.a.t.Request]
(AgentManager-Handler-5:null) Seq 1-90898443: Processing:  { Ans: , MgmtId:
161342909744, via: 1, Ver: v1, Flags: 10,
[{"com.cloud.agent.api.GetVncPortAnswer":{"address":"192.168.100.6","port":5901,"result":true,"wait":0}}]
}
2014-05-15 14:43:55,563 DEBUG [c.c.a.t.Request] (catalina-exec-15:null) Seq
1-90898443: Received:  { Ans: , MgmtId: 161342909744, via: 1, Ver: v1,
Flags: 10, { GetVncPortAnswer } }
2014-05-15 14:43:55,563 DEBUG [c.c.s.ConsoleProxyServlet]
(catalina-exec-15:null) Port info 192.168.100.6
2014-05-15 14:43:55,563 INFO  [c.c.s.ConsoleProxyServlet]
(catalina-exec-15:null) Parse host info returned from executing
GetVNCPortCommand. host info: 192.168.100.6
2014-05-15 14:43:55,570 DEBUG [c.c.s.ConsoleProxyServlet]
(catalina-exec-15:null) Compose console url:
https://192-168-100-159.realhostip.com/ajax?token=CsPhU4m_R2ZoLIdXOtjo3y3humnQN20wt5fSPjbZOHtRh7nli7tiq0ZiWUuwCVIn7K0vuegv6oMAAq_vDY4Vr_f7jwoVQDkxAE1vmK9oRhy9pvBVlmAdCer6hlVjXQlwL9oJEQO4thhSDg2qeNji02xuxlSmDilVKnd9U9xiHqIV-PgktrKq3J2GT1EpcpTvhsew5COQ1h3j8M9IM8KLZpYA0dDp7TejMmfgSiQI8ifZSh_nNLyyqBzYvl1XWxSaDIrnj7UsP3JKUq74kdY5Pg
2014-05-15 14:43:55,570 DEBUG [c.c.s.ConsoleProxyServlet]
(catalina-exec-15:null) the console url is ::
v-2-VMhttps://192-168-100-159.realhostip.com/ajax?token=CsPhU4m_R2ZoLIdXOtjo3y3humnQN20wt5fSPjbZOHtRh7nli7tiq0ZiWUuwCVIn7K0vuegv6oMAAq_vDY4Vr_f7jwoVQDkxAE1vmK9oRhy9pvBVlmAdCer6hlVjXQlwL9oJEQO4thhSDg2qeNji02xuxlSmDilVKnd9U9xiHqIV-PgktrKq3J2GT1EpcpTvhsew5COQ1h3j8M9IM8KLZpYA0dDp7TejMmfgSiQI8ifZSh_nNLyyqBzYvl1XWxSaDIrnj7UsP3JKUq74kdY5Pg

ssl_access_log:
192.168.100.166 - - [15/May/2014:14:44:55 -0700] "GET
/client/console?cmd=access&vm=086b5822-de00-4764-8b05-d8e00657ee54
HTTP/1.1" 200 405


On Wed, May 14, 2014 at 5:56 PM, Ian Young  wrote:

> Looks like it's still using HTTP, not HTTPS:
>
> 2014-05-14 17:52:35,812 DEBUG [c.c.a.t.Request] (catalina-exec-20:null)
> Seq 1-800529939: Sending  { Cmd , MgmtId: 161342909744, via: 1(
> virthost1.lax.ratespecial.com), Ver: v1, Flags: 100011,
> [{"com.cloud.agent.api.GetVncPortCommand":{"id":6,"name":"i-5-6-VM","wait":0}}]
> }
> 2014-05-14 17:52:35,861 DEBUG [c.c.a.t.Request]
> (AgentManager-Handler-1:null) Seq 1-800529939: Processing:  { Ans: ,
> MgmtId: 161342909744, via: 1, Ver: v1, Flags: 10,
> [{"com.cloud.agent.api.GetVncPortAnswer":{"address":"192.168.100.6","port":5903,"result":true,"wait":0}}]
> }
> 2014-05-14 17:52:35,861 DEBUG [c.c.a.t.Request] (catalina-exec-20:null)
> Seq 1-800529939: Received:  { Ans: , MgmtId: 161342909744, via: 1, Ver: v1,
> Flags: 10, { GetVncPortAnswer } }
> 2014-05-14 17:52:35,861 DEBUG [c.c.s.ConsoleProxyServlet]
> (catalina-exec-20:null) Port info 192.168.100.6
> 2014-05-14 17:52:35,861 INFO  [c.c.s.ConsoleProxyServlet]
> (catalina-exec-20:null) Parse host info returned from executing
> GetVNCPortCommand. host info: 192.168.100.6
> 2014-05-14 17:52:35,865 DEBUG [c.c.s.ConsoleProxyServlet]
> (catalina-exec-20:null) Compose console url:
> http://192.168.100.159/ajax?token=CsPhU4m_R2ZoLIdXOtjo3y3humnQN20wt5fSPjbZOHtRh7nli7tiq0ZiWUuwCVIn_GSECIK5nC2lBX8cMHvt1_GrmwDVK1PEEAwyueLlgNRgodobz8Lsyv2jEc-mUvMH340AYGt0FyZOuXIA6dunN3yx-bP-vp4rao5Up61eJwOvqFr3PhggNpbq5Up59ObOdYMe2GsBP_3FrL8ZQfBhNBSmViHQ0fKJSyUHDoC9tKlfs2Bb0rPOBxsZeTPfe-hDuaVT-pZxjQXCKM93sujnWw
> 2014-05-14 17:52:35,865 DEBUG [c.c.s.ConsoleProxyServlet]
> (catalina-exec-20:null) the console url is ::
> phonesynergyhttp://192.168.100.159/ajax?token=CsPhU4m_R2ZoLIdXOtjo3y3humnQN20wt5fSPjbZOHtRh7nli7tiq0ZiWUuwCVIn_GSECIK5nC2lBX8cMHvt1_GrmwDVK1PEEAwyueLlgNRgodobz8Lsyv2jEc-mUvMH340AYGt0FyZOuXIA6dunN3yx-bP-vp4rao5Up61eJwOvqFr3PhggNpbq5Up59ObOdYMe2GsBP_3FrL8ZQfBhNBSmViHQ0fKJSyUHDoC9tKlfs2Bb0rPOBxsZeTPfe-hDuaVT-pZxjQXCKM93sujnWw
>
>
> On Wed, May 14, 2014 at 5:41 PM, Ian Young  wrote:
>
>> I decided to create my own internal realhostip.com.  My DNS servers use
>> PowerDNS, not BIND, so the $GENERATE directive was not an option and I
>> didn't want to have to populate my DNS servers' databases with a record for
>> every possible 

Re: cloudstack 4.3 installation on CentOS

2014-05-16 Thread Ian Young
Also, which version of NFS are you using?  For NFSv4 you need to export a
pseudo file system designated by fsid=0, which this guide doesn't mention.
 See section "18.7.1.1. Using exportfs with NFSv4" in the following article
for more information:

http://www.centos.org/docs/5/html/Deployment_Guide-en-US/s1-nfs-server-config-exports.html
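
If it helps, here is a minimal Python sketch for checking whether an exports file declares an NFSv4 pseudo root at all. It only scans the text for an fsid=0 (or the equivalent fsid=root) option. The sample paths and options below are hypothetical and are not taken from the quick start guide.

```python
def nfsv4_root_exports(exports_text):
    """Return the export paths whose option list contains fsid=0 or fsid=root."""
    roots = []
    for raw in exports_text.splitlines():
        # Drop comments and blank lines.
        line = raw.split("#", 1)[0].strip()
        if not line:
            continue
        fields = line.split()
        path, clients = fields[0], fields[1:]
        for client in clients:
            # Each client spec is expected to look like host(opt1,opt2,...).
            if "(" in client and client.endswith(")"):
                options = client[client.index("(") + 1:-1].split(",")
                if "fsid=0" in options or "fsid=root" in options:
                    roots.append(path)
                    break
    return roots


# Hypothetical layout: /export is the pseudo root, storage lives below it.
sample = """
/export            *(rw,fsid=0,no_subtree_check)
/export/secondary  *(rw,async,no_root_squash,no_subtree_check)
"""
print(nfsv4_root_exports(sample))
```

An empty list would suggest the server is exporting without a pseudo root, which is the NFSv4 situation described above.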


On Thu, May 15, 2014 at 10:37 AM, Ian Young  wrote:

> That quick start guide used to have the wrong URL for the system VM
> template but it looks like it has been corrected since then.  Check your
> command line history and see if the template you downloaded was the same as
> the one in this section:
>
>
> http://cloudstack-installation.readthedocs.org/en/latest/qig.html#system-template-setup
>
>
> On Wed, May 7, 2014 at 10:26 AM, dimas yoga pratama wrote:
>
>> Yes, I followed the wizard.
>> This is my configuration for the basic installation.
>>
>> management server IP : 10.151.32.51
>>
>> Zone Configuration:
>>
>> Name : Zone1
>> Public DNS1: 202.46.129.2 (College DNS)
>> Public DNS2: -
>> Internal DNS1: 10.151.32.6 (My Lab DNS)
>> Internal DNS2: -
>>
>> Pod Configuration:
>>
>> Name : Pod1
>> Gateway : 10.151.32.1
>> Netmask : 255.255.255.0
>> Start/end reserved system IPs : 10.151.32.60 - 10.151.32.80
>> Guest Gateway : 10.151.32.1
>> Guest Netmask : 255.255.255.0
>> Guest start/end IP : 10.151.32.90 - 10.151.32.200
>>
>>
>> Cluster Configuration:
>>
>> Name: Cluster1
>> Hypervisor:KVM
>>
>>
>> Host Configuration:
>>
>> Hostname:10.151.32.51 (because I built CloudStack on a single machine)
>> Username:root
>> Password:password
>>
>> Primary storage:
>>
>> Name:  Primary1
>> Server: 10.151.32.51
>> Path :/primary
>>
>> Secondary storage:
>>
>> NFS server: 10.151.32.51
>> Path :/secondary
>>
>>
>> Is there anything wrong with my configuration?
>> I want to test CloudStack in my lab environment.
>> Thanks.
>>
>>
>> On Wed, May 7, 2014 at 6:31 AM, Pierre-Luc Dion 
>> wrote:
>>
>> > Hi,
>> > Did you follow the zone creation wizard from the ui?
>> >
>> > Could it be that there are not enough management IPs in the pod?
>> >
>> > Le mardi 6 mai 2014, dimas yoga pratama  a écrit :
>> >
>> > > Hi all,
>> > >
>> > > I'm trying to install CloudStack 4.3 on CentOS 6.5 on a single machine
>> > > behind a proxy. I followed this guide:
>> > > http://cloudstack-installation.readthedocs.org/en/latest/qig.html
>> > > I managed to log in to the CloudStack dashboard and succeeded in adding
>> > > a host. When it came to the "creating system VMs (this may take a
>> > > while)" step, it took a very, very long time, and when I refreshed the
>> > > browser it suddenly redirected to the dashboard. I checked the
>> > > infrastructure tab and found 2 system VMs already created; the VM state
>> > > showed "starting" but the agent state showed nothing. In the dashboard
>> > > tab I also found this notification: "Management Server: Management
>> > > network CIDR is not configured original.type 14"
>> > > Why is that happening? Is anything wrong with my installation?
>> > >
>> > > Looking forward to your answer.
>> > >
>> >
>> >
>> > --
>> >
>> > Pierre-Luc Dion
>> > Architecte de Solution Cloud | Cloud Solutions Architect
>> > 855-OK-CLOUD (855-652-5683) x1101
>> > - - -
>> >
>> > *CloudOps*
>> > 420 rue Guy
>> > Montréal QC  H3J 1S6
>> > www.cloudops.com
>> > @CloudOps_
>> >
>>
>
>


Re: replacement for realhostip

2014-05-16 Thread Ian Young
I was able to confirm the certificate by going directly to
https://192-168-100-159.realhostip.com/ in the browser.  I wish there was
an easier way to do this.  I don't mind the extra step, and the rest of my
tech team will understand how it works but it's going to be a hassle
explaining this procedure to everyone else.  I really hope someone can
think of a more elegant alternative to realhostip.com when 4.4 is released.
 It will lead to better product adoption.
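
For anyone who has to explain the extra step to others, this is a small sketch of how the console proxy hostname seen in the logs appears to be formed: the proxy's IP with dots turned into dashes, followed by the configured domain. The function name is made up for illustration, and the default domain is simply the consoleproxy.url.domain value used in this thread.

```python
import ipaddress


def console_proxy_host(ip, domain="realhostip.com"):
    """Build the dashed hostname for a console proxy IP (illustrative helper)."""
    # IPv4Address raises a ValueError subclass if the input is not a valid address.
    addr = ipaddress.IPv4Address(ip)
    return str(addr).replace(".", "-") + "." + domain


# The URL to open once in the browser so the certificate can be accepted.
print("https://" + console_proxy_host("192.168.100.159") + "/")
```

Handing users the printed URL for their zone's console proxy IP is one way to make the certificate step a single click.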


On Fri, May 16, 2014 at 10:49 AM, Ian Young  wrote:

> Ok, so the console proxy needed to be restarted in order for the 
> consoleproxy.url.domain
> setting to take effect.  However, I still can't see the console.  In
> Chrome, it just shows a frowning face with no error message (not very
> useful).  In Firefox, at least it tells me the certificate is not trusted
> because it is self-signed but it doesn't give me the option to accept it.
>
> It's not an unreasonable expectation to be able to use self-signed SSL
> certificates for an internal site.  Is there a setting in CloudStack that
> allows them to be trusted?
>
>
> On Fri, May 16, 2014 at 10:38 AM, Ian Young wrote:
>
>> The problem appears to be with the console proxy itself.  Here are the
>> ports that are listening on the public interface, according to an nmap TCP
>> scan:
>>
>> PORT    STATE  SERVICE
>> 80/tcp  open   http
>> 443/tcp closed https
>>
>> When I logged into the console proxy through the link local address, I
>> checked for processes on port 443 and there are none, so obviously an HTTPS
>> connection can't be made.  There is a Java process listening on port 80 but
>> nothing on 443.  Is there something in the global settings that will enable
>> HTTPS, or is this a bug?
>>
>> root@v-2-VM:~# netstat -lnp | grep java
>> tcp        0      0 0.0.0.0:8001      0.0.0.0:*      LISTEN      3491/java
>> tcp        0      0 0.0.0.0:80        0.0.0.0:*      LISTEN      3491/java
>>
>>
>> On Thu, May 15, 2014 at 2:53 PM, Ian Young wrote:
>>
>>> I just realized I had to set the consoleproxy.url.domain field to "
>>> realhostip.com" but now when I try to view the console, the browser
>>> says "The server refused the connection."  Does that indicate a problem
>>> with the SSL certificate?
>>>
>>> management-server.log:
>>> 2014-05-15 14:43:55,506 DEBUG [c.c.a.t.Request] (catalina-exec-15:null)
>>> Seq 1-90898443: Sending  { Cmd , MgmtId: 161342909744, via: 1(
>>> virthost1.lax.ratespecial.com), Ver: v1, Flags: 100011,
>>> [{"com.cloud.agent.api.GetVncPortCommand":{"id":2,"name":"v-2-VM","wait":0}}]
>>> }
>>> 2014-05-15 14:43:55,563 DEBUG [c.c.a.t.Request]
>>> (AgentManager-Handler-5:null) Seq 1-90898443: Processing:  { Ans: , MgmtId:
>>> 161342909744, via: 1, Ver: v1, Flags: 10,
>>> [{"com.cloud.agent.api.GetVncPortAnswer":{"address":"192.168.100.6","port":5901,"result":true,"wait":0}}]
>>> }
>>> 2014-05-15 14:43:55,563 DEBUG [c.c.a.t.Request] (catalina-exec-15:null)
>>> Seq 1-90898443: Received:  { Ans: , MgmtId: 161342909744, via: 1, Ver: v1,
>>> Flags: 10, { GetVncPortAnswer } }
>>> 2014-05-15 14:43:55,563 DEBUG [c.c.s.ConsoleProxyServlet]
>>> (catalina-exec-15:null) Port info 192.168.100.6
>>> 2014-05-15 14:43:55,563 INFO  [c.c.s.ConsoleProxyServlet]
>>> (catalina-exec-15:null) Parse host info returned from executing
>>> GetVNCPortCommand. host info: 192.168.100.6
>>> 2014-05-15 14:43:55,570 DEBUG [c.c.s.ConsoleProxyServlet]
>>> (catalina-exec-15:null) Compose console url:
>>> https://192-168-100-159.realhostip.com/ajax?token=CsPhU4m_R2ZoLIdXOtjo3y3humnQN20wt5fSPjbZOHtRh7nli7tiq0ZiWUuwCVIn7K0vuegv6oMAAq_vDY4Vr_f7jwoVQDkxAE1vmK9oRhy9pvBVlmAdCer6hlVjXQlwL9oJEQO4thhSDg2qeNji02xuxlSmDilVKnd9U9xiHqIV-PgktrKq3J2GT1EpcpTvhsew5COQ1h3j8M9IM8KLZpYA0dDp7TejMmfgSiQI8ifZSh_nNLyyqBzYvl1XWxSaDIrnj7UsP3JKUq74kdY5Pg
>>> 2014-05-15 14:43:55,570 DEBUG [c.c.s.ConsoleProxyServlet]
>>> (catalina-exec-15:null) the console url is ::
>>> v-2-VMhttps://192-168-100-159.realhostip.com/ajax?token=CsPhU4m_R2ZoLIdXOtjo3y3humnQN20wt5fSPjbZOHtRh7nli7tiq0ZiWUuwCVIn7K0vuegv6oMAAq_vDY4Vr_f7jwoVQDkxAE1vmK9oRhy9pvBVlmAdCer6hlVjXQlwL9oJEQO4thhSDg2qeNji02xuxlSmDilVKnd9U9xiHqIV-PgktrKq3J2GT1EpcpTvhsew5COQ1h3j8M9IM8KLZpYA0dDp7TejMmfgSiQI8ifZSh_nNLyyqBzYvl1XWxSaDIrnj7UsP3JKUq74kdY5Pg
>>>
>>> ssl_access_log:
>>> 192.168.100.166 - - [15/May/2014:14:44:55 -0700] &q

Re: replacement for realhostip

2014-05-16 Thread Ian Young
4.3
consoleproxy.url.domain = realhostip.com

It's working now.  I'm just responding to clarify those questions.


On Thu, May 15, 2014 at 10:43 AM, Amogh Vasekar wrote:

> Hi,
>
> Which version of CloudStack are you on?
> Also, what does the config "consoleproxy.url.domain" refer to?
>
> Thanks,
> Amogh
>
> On 5/14/14 5:41 PM, "Ian Young"  wrote:
>
> >I decided to create my own internal realhostip.com.  My DNS servers use
> >PowerDNS, not BIND, so the $GENERATE directive was not an option and I
> >didn't want to have to populate my DNS servers' databases with a record
> >for every possible IP address.  Fortunately, I found the following Lua
> >script:
> >
> >https://github.com/terbolous/powerdns-cloudstack-proxy-dns
> >
> >I can confirm the Lua script works as expected and my CloudStack server
> >can be tricked into believing my internal DNS servers are the authority
> >for realhostip.com:
> >
> >[root@virthost1 ]# dig +short 1-2-3-4.realhostip.com
> >1.2.3.4
> >
> >I followed this guide and updated the console proxy/SSVM SSL certificate
> >with my own *.realhostip.com certificate.
> >
> >
> >http://docs.cloudstack.apache.org/projects/cloudstack-administration/en/latest/systemvm.html#changing-the-console-proxy-ssl-certificate-and-domain
> >
> >The console proxy restarted but it's still blank when I try to view the
> >console.  Does the domain have to be something other than realhostip.com?
>
>


Re: replacement for realhostip

2014-05-16 Thread Ian Young
Ok, so the console proxy needed to be restarted in order for the
consoleproxy.url.domain
setting to take effect.  However, I still can't see the console.  In
Chrome, it just shows a frowning face with no error message (not very
useful).  In Firefox, at least it tells me the certificate is not trusted
because it is self-signed but it doesn't give me the option to accept it.

It's not an unreasonable expectation to be able to use self-signed SSL
certificates for an internal site.  Is there a setting in CloudStack that
allows them to be trusted?


On Fri, May 16, 2014 at 10:38 AM, Ian Young  wrote:

> The problem appears to be with the console proxy itself.  Here are the
> ports that are listening on the public interface, according to an nmap TCP
> scan:
>
> PORT    STATE  SERVICE
> 80/tcp  open   http
> 443/tcp closed https
>
> When I logged into the console proxy through the link local address, I
> checked for processes on port 443 and there are none, so obviously an HTTPS
> connection can't be made.  There is a Java process listening on port 80 but
> nothing on 443.  Is there something in the global settings that will enable
> HTTPS, or is this a bug?
>
> root@v-2-VM:~# netstat -lnp | grep java
> tcp        0      0 0.0.0.0:8001      0.0.0.0:*      LISTEN      3491/java
> tcp        0      0 0.0.0.0:80        0.0.0.0:*      LISTEN      3491/java
>
>
> On Thu, May 15, 2014 at 2:53 PM, Ian Young  wrote:
>
>> I just realized I had to set the consoleproxy.url.domain field to "
>> realhostip.com" but now when I try to view the console, the browser says
>> "The server refused the connection."  Does that indicate a problem with the
>> SSL certificate?
>>
>> management-server.log:
>> 2014-05-15 14:43:55,506 DEBUG [c.c.a.t.Request] (catalina-exec-15:null)
>> Seq 1-90898443: Sending  { Cmd , MgmtId: 161342909744, via: 1(
>> virthost1.lax.ratespecial.com), Ver: v1, Flags: 100011,
>> [{"com.cloud.agent.api.GetVncPortCommand":{"id":2,"name":"v-2-VM","wait":0}}]
>> }
>> 2014-05-15 14:43:55,563 DEBUG [c.c.a.t.Request]
>> (AgentManager-Handler-5:null) Seq 1-90898443: Processing:  { Ans: , MgmtId:
>> 161342909744, via: 1, Ver: v1, Flags: 10,
>> [{"com.cloud.agent.api.GetVncPortAnswer":{"address":"192.168.100.6","port":5901,"result":true,"wait":0}}]
>> }
>> 2014-05-15 14:43:55,563 DEBUG [c.c.a.t.Request] (catalina-exec-15:null)
>> Seq 1-90898443: Received:  { Ans: , MgmtId: 161342909744, via: 1, Ver: v1,
>> Flags: 10, { GetVncPortAnswer } }
>> 2014-05-15 14:43:55,563 DEBUG [c.c.s.ConsoleProxyServlet]
>> (catalina-exec-15:null) Port info 192.168.100.6
>> 2014-05-15 14:43:55,563 INFO  [c.c.s.ConsoleProxyServlet]
>> (catalina-exec-15:null) Parse host info returned from executing
>> GetVNCPortCommand. host info: 192.168.100.6
>> 2014-05-15 14:43:55,570 DEBUG [c.c.s.ConsoleProxyServlet]
>> (catalina-exec-15:null) Compose console url:
>> https://192-168-100-159.realhostip.com/ajax?token=CsPhU4m_R2ZoLIdXOtjo3y3humnQN20wt5fSPjbZOHtRh7nli7tiq0ZiWUuwCVIn7K0vuegv6oMAAq_vDY4Vr_f7jwoVQDkxAE1vmK9oRhy9pvBVlmAdCer6hlVjXQlwL9oJEQO4thhSDg2qeNji02xuxlSmDilVKnd9U9xiHqIV-PgktrKq3J2GT1EpcpTvhsew5COQ1h3j8M9IM8KLZpYA0dDp7TejMmfgSiQI8ifZSh_nNLyyqBzYvl1XWxSaDIrnj7UsP3JKUq74kdY5Pg
>> 2014-05-15 14:43:55,570 DEBUG [c.c.s.ConsoleProxyServlet]
>> (catalina-exec-15:null) the console url is ::
>> v-2-VMhttps://192-168-100-159.realhostip.com/ajax?token=CsPhU4m_R2ZoLIdXOtjo3y3humnQN20wt5fSPjbZOHtRh7nli7tiq0ZiWUuwCVIn7K0vuegv6oMAAq_vDY4Vr_f7jwoVQDkxAE1vmK9oRhy9pvBVlmAdCer6hlVjXQlwL9oJEQO4thhSDg2qeNji02xuxlSmDilVKnd9U9xiHqIV-PgktrKq3J2GT1EpcpTvhsew5COQ1h3j8M9IM8KLZpYA0dDp7TejMmfgSiQI8ifZSh_nNLyyqBzYvl1XWxSaDIrnj7UsP3JKUq74kdY5Pg
>>
>> ssl_access_log:
>> 192.168.100.166 - - [15/May/2014:14:44:55 -0700] "GET
>> /client/console?cmd=access&vm=086b5822-de00-4764-8b05-d8e00657ee54
>> HTTP/1.1" 200 405
>>
>>
>> On Wed, May 14, 2014 at 5:56 PM, Ian Young wrote:
>>
>>> Looks like it's still using HTTP, not HTTPS:
>>>
>>> 2014-05-14 17:52:35,812 DEBUG [c.c.a.t.Request] (catalina-exec-20:null)
>>> Seq 1-800529939: Sending  { Cmd , MgmtId: 161342909744, via: 1(
>>> virthost1.lax.ratespecial.com), Ver: v1, Flags: 100011,
>>> [{"com.cloud.agent.api.GetVncPortCommand":{"id":6,"name":"i-5-6-VM","wait":0}}]
>>> }
>>> 2014-05-14 17:52:35,861 DEBUG [c.c.a.t.Request]
>>> (AgentManager-Handler-1:null) Seq 1-800529939: Processing:  { Ans: ,
>>> MgmtI

Re: replacement for realhostip

2014-05-16 Thread Ian Young
The problem appears to be with the console proxy itself.  Here are the
ports that are listening on the public interface, according to an nmap TCP
scan:

PORT    STATE  SERVICE
80/tcp  open   http
443/tcp closed https

When I logged into the console proxy through the link local address, I
checked for processes on port 443 and there are none, so obviously an HTTPS
connection can't be made.  There is a Java process listening on port 80 but
nothing on 443.  Is there something in the global settings that will enable
HTTPS, or is this a bug?

root@v-2-VM:~# netstat -lnp | grep java
tcp        0      0 0.0.0.0:8001      0.0.0.0:*      LISTEN      3491/java
tcp        0      0 0.0.0.0:80        0.0.0.0:*      LISTEN      3491/java
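
The same open/closed check can be repeated from any client machine without nmap. The sketch below is a plain TCP connect test using Python's standard library. It only shows whether something accepts connections on the port and says nothing about the certificate. The function name is hypothetical.

```python
import socket


def tcp_port_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


# Example call against the console proxy's public address from this thread:
#   for port in (80, 443):
#       print(port, tcp_port_open("192.168.100.159", port))
```

A False result for 443 alongside True for 80 would match the nmap and netstat output above.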


On Thu, May 15, 2014 at 2:53 PM, Ian Young  wrote:

> I just realized I had to set the consoleproxy.url.domain field to "
> realhostip.com" but now when I try to view the console, the browser says
> "The server refused the connection."  Does that indicate a problem with the
> SSL certificate?
>
> management-server.log:
> 2014-05-15 14:43:55,506 DEBUG [c.c.a.t.Request] (catalina-exec-15:null)
> Seq 1-90898443: Sending  { Cmd , MgmtId: 161342909744, via: 1(
> virthost1.lax.ratespecial.com), Ver: v1, Flags: 100011,
> [{"com.cloud.agent.api.GetVncPortCommand":{"id":2,"name":"v-2-VM","wait":0}}]
> }
> 2014-05-15 14:43:55,563 DEBUG [c.c.a.t.Request]
> (AgentManager-Handler-5:null) Seq 1-90898443: Processing:  { Ans: , MgmtId:
> 161342909744, via: 1, Ver: v1, Flags: 10,
> [{"com.cloud.agent.api.GetVncPortAnswer":{"address":"192.168.100.6","port":5901,"result":true,"wait":0}}]
> }
> 2014-05-15 14:43:55,563 DEBUG [c.c.a.t.Request] (catalina-exec-15:null)
> Seq 1-90898443: Received:  { Ans: , MgmtId: 161342909744, via: 1, Ver: v1,
> Flags: 10, { GetVncPortAnswer } }
> 2014-05-15 14:43:55,563 DEBUG [c.c.s.ConsoleProxyServlet]
> (catalina-exec-15:null) Port info 192.168.100.6
> 2014-05-15 14:43:55,563 INFO  [c.c.s.ConsoleProxyServlet]
> (catalina-exec-15:null) Parse host info returned from executing
> GetVNCPortCommand. host info: 192.168.100.6
> 2014-05-15 14:43:55,570 DEBUG [c.c.s.ConsoleProxyServlet]
> (catalina-exec-15:null) Compose console url:
> https://192-168-100-159.realhostip.com/ajax?token=CsPhU4m_R2ZoLIdXOtjo3y3humnQN20wt5fSPjbZOHtRh7nli7tiq0ZiWUuwCVIn7K0vuegv6oMAAq_vDY4Vr_f7jwoVQDkxAE1vmK9oRhy9pvBVlmAdCer6hlVjXQlwL9oJEQO4thhSDg2qeNji02xuxlSmDilVKnd9U9xiHqIV-PgktrKq3J2GT1EpcpTvhsew5COQ1h3j8M9IM8KLZpYA0dDp7TejMmfgSiQI8ifZSh_nNLyyqBzYvl1XWxSaDIrnj7UsP3JKUq74kdY5Pg
> 2014-05-15 14:43:55,570 DEBUG [c.c.s.ConsoleProxyServlet]
> (catalina-exec-15:null) the console url is ::
> v-2-VMhttps://192-168-100-159.realhostip.com/ajax?token=CsPhU4m_R2ZoLIdXOtjo3y3humnQN20wt5fSPjbZOHtRh7nli7tiq0ZiWUuwCVIn7K0vuegv6oMAAq_vDY4Vr_f7jwoVQDkxAE1vmK9oRhy9pvBVlmAdCer6hlVjXQlwL9oJEQO4thhSDg2qeNji02xuxlSmDilVKnd9U9xiHqIV-PgktrKq3J2GT1EpcpTvhsew5COQ1h3j8M9IM8KLZpYA0dDp7TejMmfgSiQI8ifZSh_nNLyyqBzYvl1XWxSaDIrnj7UsP3JKUq74kdY5Pg
>
> ssl_access_log:
> 192.168.100.166 - - [15/May/2014:14:44:55 -0700] "GET
> /client/console?cmd=access&vm=086b5822-de00-4764-8b05-d8e00657ee54
> HTTP/1.1" 200 405
>
>
> On Wed, May 14, 2014 at 5:56 PM, Ian Young  wrote:
>
>> Looks like it's still using HTTP, not HTTPS:
>>
>> 2014-05-14 17:52:35,812 DEBUG [c.c.a.t.Request] (catalina-exec-20:null)
>> Seq 1-800529939: Sending  { Cmd , MgmtId: 161342909744, via: 1(
>> virthost1.lax.ratespecial.com), Ver: v1, Flags: 100011,
>> [{"com.cloud.agent.api.GetVncPortCommand":{"id":6,"name":"i-5-6-VM","wait":0}}]
>> }
>> 2014-05-14 17:52:35,861 DEBUG [c.c.a.t.Request]
>> (AgentManager-Handler-1:null) Seq 1-800529939: Processing:  { Ans: ,
>> MgmtId: 161342909744, via: 1, Ver: v1, Flags: 10,
>> [{"com.cloud.agent.api.GetVncPortAnswer":{"address":"192.168.100.6","port":5903,"result":true,"wait":0}}]
>> }
>> 2014-05-14 17:52:35,861 DEBUG [c.c.a.t.Request] (catalina-exec-20:null)
>> Seq 1-800529939: Received:  { Ans: , MgmtId: 161342909744, via: 1, Ver: v1,
>> Flags: 10, { GetVncPortAnswer } }
>> 2014-05-14 17:52:35,861 DEBUG [c.c.s.ConsoleProxyServlet]
>> (catalina-exec-20:null) Port info 192.168.100.6
>> 2014-05-14 17:52:35,861 INFO  [c.c.s.ConsoleProxyServlet]
>> (catalina-exec-20:null) Parse host info returned from executing
>> GetVNCPortCommand. host info: 192.168.100.6
>> 2014-05-14 17:52:35,865 DEBUG [c.c.s.ConsoleProxyServlet]
>

Re: new installation--ssvm won't start

2014-05-15 Thread Ian Young
I wiped the server clean and started over again today.  In the process, I
realized that, the previous time, I forgot to uncomment the Domain line in
/etc/idmapd.conf.  However, even though I included the step this time, the
GUI installer still seems to hang on the final "Creating system VMs" step.
 I see two VMs running when I run "virsh list" (the secondary storage VM
keeps getting regenerated).  In the primary storage, it looks like there is
one complete 693 MB image but the other two are only 11 and 12 MB, although
they are gradually growing.  What's happening here?

[root@virthost1 ~]# ls -hl /var/primary/
total 715M
-rwxr--r--. 1 nobody nobody  11M May  8 09:55 54de167f-ad9c-453b-91c7-fdd644922932
-rwxr--r--. 1 nobody nobody  12M May  8 09:55 91069b66-b1b3-41aa-8995-874fd4353473
-rwxr--r--. 1 nobody nobody 693M May  8 09:16 c2e6efba-d6c7-11e3-9e76-002590c96d30

The management server log keeps reporting that "There is no secondary
storage VM for secondary storage host nfs://192.168.100.6/var/secondary."
 Here is a larger section of logs:
http://pastebin.com/NFf5cBx3


On Wed, May 7, 2014 at 10:49 AM, Ian Young  wrote:

> I noticed that in Home > Infrastructure > Zones > Zone1, Resources tab,
> the Secondary Storage says "Allocated 0.00 KB / 0.00 KB".  However, the
> secondary storage NFS mount is listed in Home > Infrastructure > Secondary
> Storage and the URL is correct.  Does this mean the secondary storage is
> unreachable?
>
>
> On Wed, May 7, 2014 at 10:26 AM, Ian Young  wrote:
>
>> I reinstalled my single server CloudStack system yesterday, following the
>> quick start guide precisely.  The only difference was that I used
>> /var/primary and /var/secondary instead of /primary and /secondary, because
>> the /var partition on this machine is very large.  The UI installer reached
>> the point where it says "Creating system VMs (this may take a while)" but
>> never finished.  I left it overnight and it still hadn't completed.  This
>> is typically the step that fails, most of the times I've installed
>> CloudStack, so I imagine I must be making the same fundamental mistake each
>> time, and I'd like to know what that is.
>>
>> I checked management.log and it's in a loop where it creates a secondary
>> storage VM, fails to start it, destroys it, and tries again.  It says Host
>> 1 is unreachable but I'm using the correct password, SELinux is permissive,
>> and all the iptables rules are in place.  In what way is it trying to
>> connect to Host 1?  SSH?  NFS?  Here's a log excerpt of messages related to
>> the SSVM:
>>
>> http://pastebin.com/X11A51bh
>>
>> NFS appears to be functional, since CloudStack automatically mounted the
>> primary storage.
>>
>> Filesystem                  Size  Used Avail Use% Mounted on
>> /dev/sda3                    20G  1.8G   17G  10% /
>> tmpfs                        32G     0   32G   0% /dev/shm
>> /dev/sda1                   194M   42M  143M  23% /boot
>> /dev/sda4                   1.8T  1.9G  1.7T   1% /var
>> 192.168.100.6:/var/primary  1.8T  1.9G  1.7T   1% /mnt/0594caa2-ceb4-36c6-9b13-0ff149a130af
>>
>> How can I identify whatever it is that's preventing the SSVM from
>> starting?  Here is another log excerpt, without any filtering:
>>
>> http://pastebin.com/XsPGJQik
>>
>
>


Re: replacement for realhostip

2014-05-15 Thread Ian Young
Looks like it's still using HTTP, not HTTPS:

2014-05-14 17:52:35,812 DEBUG [c.c.a.t.Request] (catalina-exec-20:null) Seq
1-800529939: Sending  { Cmd , MgmtId: 161342909744, via: 1(
virthost1.lax.ratespecial.com), Ver: v1, Flags: 100011,
[{"com.cloud.agent.api.GetVncPortCommand":{"id":6,"name":"i-5-6-VM","wait":0}}]
}
2014-05-14 17:52:35,861 DEBUG [c.c.a.t.Request]
(AgentManager-Handler-1:null) Seq 1-800529939: Processing:  { Ans: ,
MgmtId: 161342909744, via: 1, Ver: v1, Flags: 10,
[{"com.cloud.agent.api.GetVncPortAnswer":{"address":"192.168.100.6","port":5903,"result":true,"wait":0}}]
}
2014-05-14 17:52:35,861 DEBUG [c.c.a.t.Request] (catalina-exec-20:null) Seq
1-800529939: Received:  { Ans: , MgmtId: 161342909744, via: 1, Ver: v1,
Flags: 10, { GetVncPortAnswer } }
2014-05-14 17:52:35,861 DEBUG [c.c.s.ConsoleProxyServlet]
(catalina-exec-20:null) Port info 192.168.100.6
2014-05-14 17:52:35,861 INFO  [c.c.s.ConsoleProxyServlet]
(catalina-exec-20:null) Parse host info returned from executing
GetVNCPortCommand. host info: 192.168.100.6
2014-05-14 17:52:35,865 DEBUG [c.c.s.ConsoleProxyServlet]
(catalina-exec-20:null) Compose console url:
http://192.168.100.159/ajax?token=CsPhU4m_R2ZoLIdXOtjo3y3humnQN20wt5fSPjbZOHtRh7nli7tiq0ZiWUuwCVIn_GSECIK5nC2lBX8cMHvt1_GrmwDVK1PEEAwyueLlgNRgodobz8Lsyv2jEc-mUvMH340AYGt0FyZOuXIA6dunN3yx-bP-vp4rao5Up61eJwOvqFr3PhggNpbq5Up59ObOdYMe2GsBP_3FrL8ZQfBhNBSmViHQ0fKJSyUHDoC9tKlfs2Bb0rPOBxsZeTPfe-hDuaVT-pZxjQXCKM93sujnWw
2014-05-14 17:52:35,865 DEBUG [c.c.s.ConsoleProxyServlet]
(catalina-exec-20:null) the console url is ::
phonesynergyhttp://192.168.100.159/ajax?token=CsPhU4m_R2ZoLIdXOtjo3y3humnQN20wt5fSPjbZOHtRh7nli7tiq0ZiWUuwCVIn_GSECIK5nC2lBX8cMHvt1_GrmwDVK1PEEAwyueLlgNRgodobz8Lsyv2jEc-mUvMH340AYGt0FyZOuXIA6dunN3yx-bP-vp4rao5Up61eJwOvqFr3PhggNpbq5Up59ObOdYMe2GsBP_3FrL8ZQfBhNBSmViHQ0fKJSyUHDoC9tKlfs2Bb0rPOBxsZeTPfe-hDuaVT-pZxjQXCKM93sujnWw


On Wed, May 14, 2014 at 5:41 PM, Ian Young  wrote:

> I decided to create my own internal realhostip.com.  My DNS servers use
> PowerDNS, not BIND, so the $GENERATE directive was not an option and I
> didn't want to have to populate my DNS servers' databases with a record for
> every possible IP address.  Fortunately, I found the following Lua script:
>
> https://github.com/terbolous/powerdns-cloudstack-proxy-dns
>
> I can confirm the Lua script works as expected and my CloudStack server
> can be tricked into believing my internal DNS servers are the authority for
> realhostip.com:
>
> [root@virthost1 ]# dig +short 1-2-3-4.realhostip.com
> 1.2.3.4
>
> I followed this guide and updated the console proxy/SSVM SSL certificate
> with my own *.realhostip.com certificate.
>
>
> http://docs.cloudstack.apache.org/projects/cloudstack-administration/en/latest/systemvm.html#changing-the-console-proxy-ssl-certificate-and-domain
>
> The console proxy restarted but it's still blank when I try to view the
> console.  Does the domain have to be something other than realhostip.com?
>


replacement for realhostip

2014-05-15 Thread Ian Young
I decided to create my own internal realhostip.com.  My DNS servers use
PowerDNS, not BIND, so the $GENERATE directive was not an option and I
didn't want to have to populate my DNS servers' databases with a record for
every possible IP address.  Fortunately, I found the following Lua script:

https://github.com/terbolous/powerdns-cloudstack-proxy-dns

I can confirm the Lua script works as expected and my CloudStack server can
be tricked into believing my internal DNS servers are the authority for
realhostip.com:

[root@virthost1 ]# dig +short 1-2-3-4.realhostip.com
1.2.3.4
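
To make the mapping concrete, here is a rough Python sketch of the name-to-address rule that the dig output demonstrates. It is not the Lua script's actual code, just an illustration of the same idea under the assumption that the left-most label is an IPv4 address written with dashes. The function name is made up.

```python
import ipaddress


def resolve_dashed_name(hostname, domain="realhostip.com"):
    """Map 'a-b-c-d.<domain>' to 'a.b.c.d', or return None if it does not fit."""
    name = hostname.lower().rstrip(".")
    suffix = "." + domain
    if not name.endswith(suffix):
        return None
    label = name[:-len(suffix)]
    try:
        # Validation rejects labels such as 'www' or out-of-range octets.
        return str(ipaddress.IPv4Address(label.replace("-", ".")))
    except ValueError:
        return None


print(resolve_dashed_name("1-2-3-4.realhostip.com"))
```

Because the answer is computed from the query name, no per-address records are needed, which is what $GENERATE would otherwise provide on BIND.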

I followed this guide and updated the console proxy/SSVM SSL certificate
with my own *.realhostip.com certificate.

http://docs.cloudstack.apache.org/projects/cloudstack-administration/en/latest/systemvm.html#changing-the-console-proxy-ssl-certificate-and-domain

The console proxy restarted but it's still blank when I try to view the
console.  Does the domain have to be something other than realhostip.com?


Re: new installation--ssvm won't start

2014-05-15 Thread Ian Young
I know this has something to do with idmapd and NFS.  This error keeps
appearing in /var/log/messages:
May  8 10:29:54 virthost1 rpc.idmapd[11044]: nss_getpwnam: name '0' does
not map into domain 'redacted.com'
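
One thing that can be ruled out quickly is whether the Domain line in /etc/idmapd.conf is really active, on the NFS server as well as on the client. The sketch below reads the setting with Python's configparser, assuming the usual INI-style layout with a [General] section. The sample text is hypothetical, and this check does not by itself explain the error above.

```python
import configparser


def idmapd_domain(conf_text):
    """Return the active Domain value from idmapd.conf text, or None if unset."""
    parser = configparser.ConfigParser(strict=False, interpolation=None)
    parser.read_string(conf_text)
    # Lines starting with '#' are treated as comments, so a commented-out
    # Domain line yields None here.
    return parser.get("General", "Domain", fallback=None)


# Hypothetical file contents for illustration.
sample = """
[General]
#Verbosity = 0
Domain = example.com

[Mapping]
Nobody-User = nobody
Nobody-Group = nobody
"""
print(idmapd_domain(sample))
```

To try it on a real machine, pass it the file contents, for example open("/etc/idmapd.conf").read(), and compare the result across hosts.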


On Thu, May 8, 2014 at 5:20 PM, Ian Young  wrote:

> I wiped the server clean and started over again today.  In the process, I
> realized that, the previous time, I forgot to uncomment the Domain line in
> /etc/idmapd.conf.  However, even though I included the step this time, the
> GUI installer still seems to hang on the final "Creating system VMs" step.
>  I see two VMs running when I run "virsh list" (the secondary storage VM
> keeps getting regenerated).  In the primary storage, it looks like there is
> one complete 693 MB image but the other two are only 11 and 12 MB, although
> they are gradually growing.  What's happening here?
>
> [root@virthost1 ~]# ls -hl /var/primary/
> total 715M
> -rwxr--r--. 1 nobody nobody  11M May  8 09:55 54de167f-ad9c-453b-91c7-fdd644922932
> -rwxr--r--. 1 nobody nobody  12M May  8 09:55 91069b66-b1b3-41aa-8995-874fd4353473
> -rwxr--r--. 1 nobody nobody 693M May  8 09:16 c2e6efba-d6c7-11e3-9e76-002590c96d30
>
> The management server log keeps reporting that "There is no secondary
> storage VM for secondary storage host nfs://192.168.100.6/var/secondary."
>  Here is a larger section of logs:
> http://pastebin.com/NFf5cBx3
>
>
> On Wed, May 7, 2014 at 10:49 AM, Ian Young  wrote:
>
>> I noticed that in Home > Infrastructure > Zones > Zone1, Resources tab,
>> the Secondary Storage says "Allocated 0.00 KB / 0.00 KB".  However, the
>> secondary storage NFS mount is listed in Home > Infrastructure > Secondary
>> Storage and the URL is correct.  Does this mean the secondary storage is
>> unreachable?
>>
>>
>> On Wed, May 7, 2014 at 10:26 AM, Ian Young wrote:
>>
>>> I reinstalled my single server CloudStack system yesterday, following
>>> the quick start guide precisely.  The only difference was that I used
>>> /var/primary and /var/secondary instead of /primary and /secondary, because
>>> the /var partition on this machine is very large.  The UI installer reached
>>> the point where it says "Creating system VMs (this may take a while)" but
>>> never finished.  I left it overnight and it still hadn't completed.  This
>>> is typically the step that fails, most of the times I've installed
>>> CloudStack, so I imagine I must be making the same fundamental mistake each
>>> time, and I'd like to know what that is.
>>>
>>> I checked management.log and it's in a loop where it creates a secondary
>>> storage VM, fails to start it, destroys it, and tries again.  It says Host
>>> 1 is unreachable but I'm using the correct password, SELinux is permissive,
>>> and all the iptables rules are in place.  In what way is it trying to
>>> connect to Host 1?  SSH?  NFS?  Here's a log excerpt of messages related to
>>> the SSVM:
>>>
>>> http://pastebin.com/X11A51bh
>>>
>>> NFS appears to be functional, since CloudStack automatically mounted the
>>> primary storage.
>>>
>>> Filesystem                  Size  Used Avail Use% Mounted on
>>> /dev/sda3                    20G  1.8G   17G  10% /
>>> tmpfs                        32G     0   32G   0% /dev/shm
>>> /dev/sda1                   194M   42M  143M  23% /boot
>>> /dev/sda4                   1.8T  1.9G  1.7T   1% /var
>>> 192.168.100.6:/var/primary  1.8T  1.9G  1.7T   1% /mnt/0594caa2-ceb4-36c6-9b13-0ff149a130af
>>>
>>> How can I identify whatever it is that's preventing the SSVM from
>>> starting?  Here is another log excerpt, without any filtering:
>>>
>>> http://pastebin.com/XsPGJQik
>>>
>>
>>
>


Re: new installation--ssvm won't start

2014-05-15 Thread Ian Young
I noticed that in Home > Infrastructure > Zones > Zone1, Resources tab, the
Secondary Storage says "Allocated 0.00 KB / 0.00 KB".  However, the
secondary storage NFS mount is listed in Home > Infrastructure > Secondary
Storage and the URL is correct.  Does this mean the secondary storage is
unreachable?


On Wed, May 7, 2014 at 10:26 AM, Ian Young  wrote:

> I reinstalled my single server CloudStack system yesterday, following the
> quick start guide precisely.  The only difference was that I used
> /var/primary and /var/secondary instead of /primary and /secondary, because
> the /var partition on this machine is very large.  The UI installer reached
> the point where it says "Creating system VMs (this may take a while)" but
> never finished.  I left it overnight and it still hadn't completed.  This
> is typically the step that fails, most of the times I've installed
> CloudStack, so I imagine I must be making the same fundamental mistake each
> time, and I'd like to know what that is.
>
> I checked management.log and it's in a loop where it creates a secondary
> storage VM, fails to start it, destroys it, and tries again.  It says Host
> 1 is unreachable but I'm using the correct password, SELinux is permissive,
> and all the iptables rules are in place.  In what way is it trying to
> connect to Host 1?  SSH?  NFS?  Here's a log excerpt of messages related to
> the SSVM:
>
> http://pastebin.com/X11A51bh
>
> NFS appears to be functional, since CloudStack automatically mounted the
> primary storage.
>
> Filesystem                  Size  Used Avail Use% Mounted on
> /dev/sda3                    20G  1.8G   17G  10% /
> tmpfs                        32G     0   32G   0% /dev/shm
> /dev/sda1                   194M   42M  143M  23% /boot
> /dev/sda4                   1.8T  1.9G  1.7T   1% /var
> 192.168.100.6:/var/primary  1.8T  1.9G  1.7T   1% /mnt/0594caa2-ceb4-36c6-9b13-0ff149a130af
>
> How can I identify whatever it is that's preventing the SSVM from
> starting?  Here is another log excerpt, without any filtering:
>
> http://pastebin.com/XsPGJQik
>


Re: new installation--ssvm won't start

2014-05-14 Thread Ian Young
I'm using 4.3.  The Quick Installation Guide for CentOS (which is what I
was following) still has the old URL.  I forgot to mention changing the URL
was another thing I did differently in order to get it working.

http://docs.cloudstack.apache.org/projects/cloudstack-installation/en/latest/qig.html


On Tue, May 13, 2014 at 2:01 AM, sebgoa  wrote:

>
> On May 13, 2014, at 10:02 AM, Geoff Higginbottom <
> geoff.higginbot...@shapeblue.com> wrote:
>
> > Just for the record, the latest install doc does have the correct URLs
> > for the System VM Templates
> >
> > http://docs.cloudstack.apache.org/projects/cloudstack-installation/en/master/installation.html
> >
>
> Yep, I just checked the master and the 4.3 versions and the URLs seem
> correct:
>
> http://docs.cloudstack.apache.org/projects/cloudstack-installation/en/master/installation.html#prepare-the-system-vm-template
>
> If it's not, let me know or submit a patch.
>
> >
> > Regards
> >
> > Geoff Higginbottom
> >
> > D: +44 20 3603 0542 | S: +44 20 3603 0540 | M: +447968161581
> >
> > geoff.higginbot...@shapeblue.com
> >
> > -Original Message-
> > From: dimas yoga pratama [mailto:smid...@gmail.com]
> > Sent: 12 May 2014 17:44
> > To: users@cloudstack.apache.org
> > Subject: Re: new installation--ssvm won't start
> >
> > Which version of CloudStack did you install? If you follow the CloudStack
> > 4.3 installation guide, there is a mistake in the system template setup
> > section; you should replace the old URL with:
> >
> > http://download.cloud.com/templates/4.3/systemvm64template-2014-01-14-master-kvm.qcow2.bz2
> >
> > Hope it works.
> >
> > On Fri, May 9, 2014 at 7:20 AM, Ian Young wrote:
> >
> >> I wiped the server clean and started over again today.  In the
> >> process, I realized that, the previous time, I forgot to uncomment the
> >> Domain line in /etc/idmapd.conf.  However, even though I included the
> >> step this time, the GUI installer still seems to hang on the final
> >> "Creating system VMs" step.
> >> I see two VMs running when I run "virsh list" (the secondary storage
> >> VM keeps getting regenerated).  In the primary storage, it looks like
> >> there is one complete 693 MB image but the other two are only 11 and
> >> 12 MB, although they are gradually growing.  What's happening here?
> >>
> >> [root@virthost1 ~]# ls -hl /var/primary/
> >> total 715M
> >> -rwxr--r--. 1 nobody nobody  11M May  8 09:55 54de167f-ad9c-453b-91c7-fdd644922932
> >> -rwxr--r--. 1 nobody nobody  12M May  8 09:55 91069b66-b1b3-41aa-8995-874fd4353473
> >> -rwxr--r--. 1 nobody nobody 693M May  8 09:16 c2e6efba-d6c7-11e3-9e76-002590c96d30
> >>
> >> The management server log keeps reporting that "There is no secondary
> >> storage VM for secondary storage host nfs://192.168.100.6/var/secondary."
> >> Here is a larger section of logs:
> >> http://pastebin.com/NFf5cBx3
> >>
> >>
> >> On Wed, May 7, 2014 at 10:49 AM, Ian Young wrote:
> >>
> >>> I noticed that in Home > Infrastructure > Zones > Zone1, Resources
> >>> tab, the Secondary Storage says "Allocated 0.00 KB / 0.00 KB".
> >>> However, the secondary storage NFS mount is listed in Home >
> >>> Infrastructure > Secondary Storage and the URL is correct.  Does this
> >>> mean the secondary storage is unreachable?
> >>>
> >>>
> >>> On Wed, May 7, 2014 at 10:26 AM, Ian Young wrote:
> >>>
> >>>> I reinstalled my single server CloudStack system yesterday, following
> >>>> the quick start guide precisely.  The only difference was that I used
> >>>> /var/primary and /var/secondary instead of /primary and /secondary,
> >>>> because the /var partition on this machine is very large.  The UI
> >>>> installer reached the point where it says "Creating system VMs (this
> >>>> may take a while)" but never finished.  I left it overnight and it
> >>>> still hadn't completed.  This is typically the step that fails, most
> >>>> of the times I've installed CloudStack, so I imagine I must be making
> >>>> the same fundamental mistake each time, and I'd like to

Re: new installation--ssvm won't start

2014-05-13 Thread Ian Young
Exactly.  That's the URL I was referring to.  I changed it to the
2014-01-14 template and it worked.


On Tue, May 13, 2014 at 9:52 AM, dimas yoga pratama wrote:

> Oh okay, I should have read that part as well.
> What I mean is this:
>
> http://docs.cloudstack.apache.org/projects/cloudstack-installation/en/master/qig.html#system-template-setup
>
>
>
> On Tue, May 13, 2014 at 4:01 PM, sebgoa  wrote:
>
> >
> > On May 13, 2014, at 10:02 AM, Geoff Higginbottom <
> > geoff.higginbot...@shapeblue.com> wrote:
> >
> > > Just for the record, the latest install doc does have the correct URLs
> > > for the System VM Templates
> > >
> > > http://docs.cloudstack.apache.org/projects/cloudstack-installation/en/master/installation.html
> > >
> >
> > Yep, I just checked the master and the 4.3 versions and the URLs seem
> > correct:
> >
> > http://docs.cloudstack.apache.org/projects/cloudstack-installation/en/master/installation.html#prepare-the-system-vm-template
> >
> > If it's not, let me know or submit a patch.
> >
> > >
> > > Regards
> > >
> > > Geoff Higginbottom
> > >
> > > D: +44 20 3603 0542 | S: +44 20 3603 0540 | M: +447968161581
> > >
> > > geoff.higginbot...@shapeblue.com
> > >
> > > -Original Message-
> > > From: dimas yoga pratama [mailto:smid...@gmail.com]
> > > Sent: 12 May 2014 17:44
> > > To: users@cloudstack.apache.org
> > > Subject: Re: new installation--ssvm won't start
> > >
> > > Which version of CloudStack did you install? If you follow the CloudStack
> > > 4.3 installation guide, there is a mistake in the system template setup
> > > section; you should replace the old URL with:
> > >
> > > http://download.cloud.com/templates/4.3/systemvm64template-2014-01-14-master-kvm.qcow2.bz2
> > >
> > > Hope it works.
> > >
> > > On Fri, May 9, 2014 at 7:20 AM, Ian Young wrote:
> > >
> > >> I wiped the server clean and started over again today.  In the
> > >> process, I realized that, the previous time, I forgot to uncomment the
> > >> Domain line in /etc/idmapd.conf.  However, even though I included the
> > >> step this time, the GUI installer still seems to hang on the final
> > >> "Creating system VMs" step.
> > >> I see two VMs running when I run "virsh list" (the secondary storage
> > >> VM keeps getting regenerated).  In the primary storage, it looks like
> > >> there is one complete 693 MB image but the other two are only 11 and
> > >> 12 MB, although they are gradually growing.  What's happening here?
> > >>
> > >> [root@virthost1 ~]# ls -hl /var/primary/
> > >> total 715M
> > >> -rwxr--r--. 1 nobody nobody  11M May  8 09:55 54de167f-ad9c-453b-91c7-fdd644922932
> > >> -rwxr--r--. 1 nobody nobody  12M May  8 09:55 91069b66-b1b3-41aa-8995-874fd4353473
> > >> -rwxr--r--. 1 nobody nobody 693M May  8 09:16 c2e6efba-d6c7-11e3-9e76-002590c96d30
> > >>
> > >> The management server log keeps reporting that "There is no secondary
> > >> storage VM for secondary storage host nfs://192.168.100.6/var/secondary."
> > >> Here is a larger section of logs:
> > >> http://pastebin.com/NFf5cBx3
> > >>
> > >>
> > >> On Wed, May 7, 2014 at 10:49 AM, Ian Young wrote:
> > >>
> > >>> I noticed that in Home > Infrastructure > Zones > Zone1, Resources
> > >>> tab, the Secondary Storage says "Allocated 0.00 KB / 0.00 KB".
> > >>> However, the secondary storage NFS mount is listed in Home >
> > >>> Infrastructure > Secondary Storage and the URL is correct.  Does this
> > >>> mean the secondary storage is unreachable?
> > >>>
> > >>>
> > >>> On Wed, May 7, 2014 at 10:26 AM, Ian Young wrote:
> > >>>
> > >>>> I reinstalled my single server CloudStack system yesterday, following
> > >>>> the quick start guide precisely.  The only difference was that I used
> > >>>> /var/primary and /var/secondary instead of /primary and /secondary,
> > >>>> because the /var partition on this ma

Re: new installation--ssvm won't start

2014-05-12 Thread Ian Young
I was able to complete the installation on Friday.  Two things I did
differently that were not mentioned in the quick start guide were to
disable requiretty in /etc/sudoers and to set up NFSv4 correctly (i.e. set
up a global root directory with fsid=0).  I'm not sure how much impact the
sudoers configuration had on my problem but I'm pretty sure the NFS setup
was the main issue.
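
For readers who want to confirm the NFSv4 point above, here is a minimal Python sketch that scans the text of an /etc/exports file for an export marked as the NFSv4 root (fsid=0, or the equivalent fsid=root).  The sample export lines are an assumption made up for illustration; they are not the configuration actually used in this thread, and exports whose paths contain spaces are not handled.

```python
def nfsv4_root_exports(exports_text):
    """Return the paths of exports whose options mark them as the NFSv4 root."""
    roots = []
    for raw in exports_text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and surrounding whitespace
        if not line:
            continue
        fields = line.split()
        path, clients = fields[0], fields[1:]
        for client in clients:
            # Each client entry looks like host(opt1,opt2,...); options are optional.
            if "(" in client and client.endswith(")"):
                options = client[client.index("(") + 1:-1].split(",")
                if "fsid=0" in options or "fsid=root" in options:
                    roots.append(path)
                    break
    return roots


# Hypothetical exports file for a single-server layout (an assumption).
sample = """\
# NFSv4 root plus the two CloudStack storage directories
/var            *(rw,async,no_root_squash,fsid=0)
/var/primary    *(rw,async,no_root_squash)
/var/secondary  *(rw,async,no_root_squash)
"""

print(nfsv4_root_exports(sample))
```

An empty result would suggest that no NFSv4 root export is defined, which is the condition described above as the likely main issue.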


On Thu, May 8, 2014 at 5:33 PM, Ian Young  wrote:

> I know this has something to do with idmapd and NFS.  This error keeps
> appearing in /var/log/messages:
> May  8 10:29:54 virthost1 rpc.idmapd[11044]: nss_getpwnam: name '0' does
> not map into domain 'redacted.com'
>
>
> On Thu, May 8, 2014 at 5:20 PM, Ian Young  wrote:
>
>> I wiped the server clean and started over again today.  In the process, I
>> realized that, the previous time, I forgot to uncomment the Domain line in
>> /etc/idmapd.conf.  However, even though I included the step this time, the
>> GUI installer still seems to hang on the final "Creating system VMs" step.
>>  I see two VMs running when I run "virsh list" (the secondary storage VM
>> keeps getting regenerated).  In the primary storage, it looks like there is
>> one complete 693 MB image but the other two are only 11 and 12 MB, although
>> they are gradually growing.  What's happening here?
>>
>> [root@virthost1 ~]# ls -hl /var/primary/
>> total 715M
>> -rwxr--r--. 1 nobody nobody  11M May  8 09:55 54de167f-ad9c-453b-91c7-fdd644922932
>> -rwxr--r--. 1 nobody nobody  12M May  8 09:55 91069b66-b1b3-41aa-8995-874fd4353473
>> -rwxr--r--. 1 nobody nobody 693M May  8 09:16 c2e6efba-d6c7-11e3-9e76-002590c96d30
>>
>> The management server log keeps reporting that "There is no secondary
>> storage VM for secondary storage host nfs://192.168.100.6/var/secondary."
>>  Here is a larger section of logs:
>> http://pastebin.com/NFf5cBx3
>>
>>
>> On Wed, May 7, 2014 at 10:49 AM, Ian Young wrote:
>>
>>> I noticed that in Home > Infrastructure > Zones > Zone1, Resources tab,
>>> the Secondary Storage says "Allocated 0.00 KB / 0.00 KB".  However, the
>>> secondary storage NFS mount is listed in Home > Infrastructure > Secondary
>>> Storage and the URL is correct.  Does this mean the secondary storage is
>>> unreachable?
>>>
>>>
>>> On Wed, May 7, 2014 at 10:26 AM, Ian Young wrote:
>>>
>>>> I reinstalled my single server CloudStack system yesterday, following
>>>> the quick start guide precisely.  The only difference was that I used
>>>> /var/primary and /var/secondary instead of /primary and /secondary, because
>>>> the /var partition on this machine is very large.  The UI installer reached
>>>> the point where it says "Creating system VMs (this may take a while)" but
>>>> never finished.  I left it overnight and it still hadn't completed.  This
>>>> is typically the step that fails, most of the times I've installed
>>>> CloudStack, so I imagine I must be making the same fundamental mistake each
>>>> time, and I'd like to know what that is.
>>>>
>>>> I checked management.log and it's in a loop where it creates a
>>>> secondary storage VM, fails to start it, destroys it, and tries again.  It
>>>> says Host 1 is unreachable but I'm using the correct password, SELinux is
>>>> permissive, and all the iptables rules are in place.  In what way is it
>>>> trying to connect to Host 1?  SSH?  NFS?  Here's a log excerpt of messages
>>>> related to the SSVM:
>>>>
>>>> http://pastebin.com/X11A51bh
>>>>
>>>> NFS appears to be functional, since CloudStack automatically mounted
>>>> the primary storage.
>>>>
>>>> Filesystem                  Size  Used Avail Use% Mounted on
>>>> /dev/sda3                    20G  1.8G   17G  10% /
>>>> tmpfs                        32G     0   32G   0% /dev/shm
>>>> /dev/sda1                   194M   42M  143M  23% /boot
>>>> /dev/sda4                   1.8T  1.9G  1.7T   1% /var
>>>> 192.168.100.6:/var/primary  1.8T  1.9G  1.7T   1% /mnt/0594caa2-ceb4-36c6-9b13-0ff149a130af
>>>>
>>>> How can I identify whatever it is that's preventing the SSVM from
>>>> starting?  Here is another log excerpt, without any filtering:
>>>>
>>>> http://pastebin.com/XsPGJQik
>>>>
>>>
>>>
>>
>


new installation--ssvm won't start

2014-05-12 Thread Ian Young
I reinstalled my single server CloudStack system yesterday, following the
quick start guide precisely.  The only difference was that I used
/var/primary and /var/secondary instead of /primary and /secondary, because
the /var partition on this machine is very large.  The UI installer reached
the point where it says "Creating system VMs (this may take a while)" but
never finished.  I left it overnight and it still hadn't completed.  This
is typically the step that fails, most of the times I've installed
CloudStack, so I imagine I must be making the same fundamental mistake each
time, and I'd like to know what that is.

I checked management.log and it's in a loop where it creates a secondary
storage VM, fails to start it, destroys it, and tries again.  It says Host
1 is unreachable but I'm using the correct password, SELinux is permissive,
and all the iptables rules are in place.  In what way is it trying to
connect to Host 1?  SSH?  NFS?  Here's a log excerpt of messages related to
the SSVM:

http://pastebin.com/X11A51bh

NFS appears to be functional, since CloudStack automatically mounted the
primary storage.

Filesystem                  Size  Used Avail Use% Mounted on
/dev/sda3                    20G  1.8G   17G  10% /
tmpfs                        32G     0   32G   0% /dev/shm
/dev/sda1                   194M   42M  143M  23% /boot
/dev/sda4                   1.8T  1.9G  1.7T   1% /var
192.168.100.6:/var/primary  1.8T  1.9G  1.7T   1% /mnt/0594caa2-ceb4-36c6-9b13-0ff149a130af

How can I identify whatever it is that's preventing the SSVM from starting?
Here is another log excerpt, without any filtering:

http://pastebin.com/XsPGJQik
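
One way to narrow down a noisy management.log is to group it into records and keep only the WARN/ERROR entries that mention a keyword.  The Python sketch below assumes the line layout "timestamp LEVEL [logger] message" seen in the management.log lines quoted elsewhere in this archive, and assumes that lines without a timestamp (such as exception text) continue the previous record.  The function names and the keyword are illustrative choices, not part of CloudStack.

```python
import re

# Assumed layout: "2014-04-30 12:20:52,485 ERROR [some.Logger] message..."
LOG_LINE = re.compile(
    r"^(?P<timestamp>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3})\s+"
    r"(?P<level>[A-Z]+)\s+"
    r"\[(?P<logger>[^\]]+)\]\s+"
    r"(?P<message>.*)$"
)


def parse_records(lines):
    """Group raw log lines into records, attaching continuation lines."""
    records = []
    for line in lines:
        line = line.rstrip("\n")
        match = LOG_LINE.match(line)
        if match:
            records.append(match.groupdict())
        elif records:
            # No timestamp: treat as a continuation of the previous record.
            records[-1]["message"] += "\n" + line
    return records


def find_problems(lines, keywords, levels=("WARN", "ERROR")):
    """Return records at the given levels whose message mentions any keyword."""
    wanted = [k.lower() for k in keywords]
    return [
        r for r in parse_records(lines)
        if r["level"] in levels and any(k in r["message"].lower() for k in wanted)
    ]


# Sample taken from a log excerpt quoted in this archive; the split into two
# physical lines is an assumption.
sample = [
    "2014-04-30 12:20:52,485 ERROR [cloud.async.AsyncJobManagerImpl] "
    "(Job-Executor-7:job-520 = [ 9d7c898c-b5d0-4bd0-a711-563a91d7acc9 ]) "
    "Unexpected exception while executing "
    "org.apache.cloudstack.api.command.admin.router.StartRouterCmd",
    "com.cloud.exception.AgentUnavailableException: Resource [Host:1] is "
    "unreachable: Host 1: Unable to start instance due to Unable to start "
    "VM[DomainRouter|r-63-VM] due to error in finalizeStart, not retrying",
]

for record in find_problems(sample, ["unreachable"]):
    print(record["timestamp"], record["level"], record["logger"])
```

On a real log, the lines could come from an open file object, for example `find_problems(open(path), ["unreachable"])`, where `path` points at the management server log.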


Re: basic networking, single server

2014-05-05 Thread Ian Young
I forgot to mention this system is entirely for internal purposes.  We
don't need a public network.


On Mon, May 5, 2014 at 3:39 PM, Ian Young  wrote:

> I'm reinstalling CloudStack on a single server with lots of RAM, CPU
> cores, and storage.  I also have a single 192.168.100.0/24 private
> network, which was set up before I was hired and can't be easily
> reconfigured due to the high number of employee workstations currently
> connected to it and occupying IP addresses across this range.  I see that
> the CloudStack documentation strongly recommends separate NICs for
> management traffic and guest traffic.  This server does have two NICs, so
> what would be the ideal way to configure the network?  Another switch with
> a different subnet for the management network?  What about the storage
> network?
>


basic networking, single server

2014-05-05 Thread Ian Young
I'm reinstalling CloudStack on a single server with lots of RAM, CPU cores,
and storage.  I also have a single 192.168.100.0/24 private network, which
was set up before I was hired and can't be easily reconfigured due to the
high number of employee workstations currently connected to it and
occupying IP addresses across this range.  I see that the CloudStack
documentation strongly recommends separate NICs for management traffic and
guest traffic.  This server does have two NICs, so what would be the ideal
way to configure the network?  Another switch with a different subnet for
the management network?  What about the storage network?
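
If a separate subnet is used for management traffic, it should not collide with the existing 192.168.100.0/24 range.  The Python sketch below uses the standard ipaddress module to check a candidate range; the candidate subnets are made-up examples, not recommendations from the CloudStack documentation.

```python
import ipaddress


def overlaps_existing(candidate, existing="192.168.100.0/24"):
    """Return True if the candidate subnet shares addresses with the existing LAN."""
    return ipaddress.ip_network(candidate).overlaps(ipaddress.ip_network(existing))


# Hypothetical candidates for a management subnet.
for subnet in ("192.168.100.128/25", "10.10.0.0/24"):
    status = "overlaps the LAN" if overlaps_existing(subnet) else "is separate from the LAN"
    print(subnet, status)
```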


Re: failed to start virtual router

2014-05-01 Thread Ian Young
I downgraded to 4.2.1, restored the database backup I made during the
initial upgrade to 4.3, copied the rpmsave files over
/etc/cloudstack/agent/agent.properties and
/etc/cloudstack/management/db.properties, and started the agent and
management services.  The service_ip field in the mshost table changed to
127.0.0.1 again, so I changed it to the actual IP address of the server.
Then I ran

cloud-install-sys-tmplt -m /var/storage/secondary \
  -u http://download.cloud.com/templates/4.2/systemvmtemplate-2013-06-12-master-kvm.qcow2.bz2 \
  -h kvm -F

which said it successfully installed the system VM template to
/var/storage/secondary/template/tmpl/1/3/.  Next, I restarted the
management service.  When I logged in, the management console said there
was an SSVM, console proxy, and router present but they were all stopped.
I tried starting the SSVM but it failed to start.  Here is an excerpt from
management.log:

http://pastebin.com/uW4NPZbC

Is there some sort of general troubleshooting script that could identify
major configuration problems?  I feel like this system is mostly functional
but one or two misconfigured things are causing it to break.  I'd hate to
have to wipe it clean and start over, losing all my VMs in the process.


On Wed, Apr 30, 2014 at 6:00 PM, Ian Young  wrote:

> I read this article about upgrading the system VMs:
>
>
> https://cwiki.apache.org/confluence/display/CLOUDSTACK/CloudStack+4.2+(KVM)+System+Vm+Upgrade
>
> However, there's just an empty set in the template_host_ref table.  Is
> this no longer used in 4.3?
>
>
> On Wed, Apr 30, 2014 at 5:41 PM, Ian Young  wrote:
>
>> I've tried upgrading to 4.3 again.  After poking around some more in the
>> database, I've discovered that the KVM system VM template was only 27%
>> downloaded.  I think this is why the virtual router was unable to
>> start--the template was incomplete.  Is there a way to force it to resume
>> downloading?
>>
>>
>> On Wed, Apr 30, 2014 at 3:03 PM, Ian Young wrote:
>>
>>> The address in Infrastructure > Hosts > (management server) is set to
>>> the correct IP address, not 127.0.0.1.  Why are the logs referring to
>>> 127.0.0.1?
>>>
>>>
>>> On Wed, Apr 30, 2014 at 3:00 PM, Ian Young wrote:
>>>
>>>> I notice my dashboard says "Management server node 127.0.0.1 is up."
>>>>  It used to have an actual address, not localhost.  Could this be causing
>>>> problems and if so, how can I set it back?
>>>>
>>>>
>>>> On Wed, Apr 30, 2014 at 12:40 PM, Ian Young wrote:
>>>>
>>>>> Yes, I replaced the new files with the rpmsave ones, which allowed the
>>>>> agent to start.  However, most of the functions in the management console
>>>>> fail.
>>>>>
>>>>>
>>>>> On Wed, Apr 30, 2014 at 12:34 PM, stevenliang 
>>>>> wrote:
>>>>>
>>>>>> Do you have the file db.properties.rpmsave on management server and
>>>>>> agent.properties.rpmsave on agents? If so, and the date is correct,
>>>>>> you can use it rather than db.properties and agent.properties.
>>>>>> And then restart management and agent services.
>>>>>>
>>>>>>
>>>>>> On 30/04/14 03:26 PM, Ian Young wrote:
>>>>>>
>>>>>>> Yes, I restored the DB from the backup.  When I try to start the
>>>>>>> router it says:
>>>>>>>
>>>>>>> Resource [Host:1] is unreachable: Host 1: Unable to start instance
>>>>>>> due to Unable to start VM[DomainRouter|r-63-VM] due to error in
>>>>>>> finalizeStart, not retrying
>>>>>>> The management server log says:
>>>>>>>
>>>>>>> 2014-04-30 12:20:52,485 ERROR [cloud.async.AsyncJobManagerImpl]
>>>>>>> (Job-Executor-7:job-520 = [ 9d7c898c-b5d0-4bd0-a711-563a91d7acc9 ])
>>>>>>> Unexpected exception while executing
>>>>>>> org.apache.cloudstack.api.command.admin.router.StartRouterCmd
>>>>>>> com.cloud.exception.AgentUnavailableException: Resource [Host:1] is
>>>>>>> unreachable: Host 1: Unable to start instance due to Unable to start
>>>>>>> VM[DomainRouter|r-63-VM] due to error in finalizeStart, not retrying
>>>>>>>
>>>>>>>
>>>

management server IP address

2014-05-01 Thread Ian Young
My management server IP address has always been 192.168.100.6.  Ever since
I upgraded to 4.3, it's been set to 127.0.0.1 (mshost.service_ip in the
database).  Where is this value being set and how can I change it back to
the original IP address?
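
As a stopgap, the value can be corrected directly in the database, which is what another message in this archive reports doing.  The Python sketch below shows the shape of that update against an in-memory SQLite table that stands in for the real mshost table.  The two-column table is a simplification (only service_ip is confirmed by this thread), and a real CloudStack database is MySQL, whose Python drivers typically use %s placeholders rather than ?.  It does not answer where the value is set, and the thread reports the field reverting to 127.0.0.1 on a later occasion.

```python
import sqlite3


def fix_service_ip(conn, wrong_ip, correct_ip):
    """Point every management server row using wrong_ip at correct_ip."""
    cursor = conn.execute(
        "UPDATE mshost SET service_ip = ? WHERE service_ip = ?",
        (correct_ip, wrong_ip),
    )
    conn.commit()
    return cursor.rowcount  # number of rows that were changed


# Stand-in table (an assumption): the real mshost table has more columns.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mshost (id INTEGER PRIMARY KEY, service_ip TEXT)")
conn.execute("INSERT INTO mshost (service_ip) VALUES ('127.0.0.1')")

changed = fix_service_ip(conn, "127.0.0.1", "192.168.100.6")
print(changed, conn.execute("SELECT service_ip FROM mshost").fetchall())
```

Taking a database backup before editing rows by hand is advisable, as was done before the upgrade described in this archive.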


Re: failed to start virtual router

2014-04-30 Thread Ian Young
I read this article about upgrading the system VMs:

https://cwiki.apache.org/confluence/display/CLOUDSTACK/CloudStack+4.2+(KVM)+System+Vm+Upgrade

However, there's just an empty set in the template_host_ref table.  Is this
no longer used in 4.3?


On Wed, Apr 30, 2014 at 5:41 PM, Ian Young  wrote:

> I've tried upgrading to 4.3 again.  After poking around some more in the
> database, I've discovered that the KVM system VM template was only 27%
> downloaded.  I think this is why the virtual router was unable to
> start--the template was incomplete.  Is there a way to force it to resume
> downloading?
>
>
> On Wed, Apr 30, 2014 at 3:03 PM, Ian Young  wrote:
>
>> The address in Infrastructure > Hosts > (management server) is set to the
>> correct IP address, not 127.0.0.1.  Why are the logs referring to 127.0.0.1?
>>
>>
>> On Wed, Apr 30, 2014 at 3:00 PM, Ian Young wrote:
>>
>>> I notice my dashboard says "Management server node 127.0.0.1 is up."  It
>>> used to have an actual address, not localhost.  Could this be causing
>>> problems and if so, how can I set it back?
>>>
>>>
>>> On Wed, Apr 30, 2014 at 12:40 PM, Ian Young wrote:
>>>
>>>> Yes, I replaced the new files with the rpmsave ones, which allowed the
>>>> agent to start.  However, most of the functions in the management console
>>>> fail.
>>>>
>>>>
>>>> On Wed, Apr 30, 2014 at 12:34 PM, stevenliang wrote:
>>>>
>>>>> Do you have the file db.properties.rpmsave on management server and
>>>>> agent.properties.rpmsave on agents? If so, and the date is correct,
>>>>> you can use it rather than db.properties and agent.properties.
>>>>> And then restart management and agent services.
>>>>>
>>>>>
>>>>> On 30/04/14 03:26 PM, Ian Young wrote:
>>>>>
>>>>>> Yes, I restored the DB from the backup.  When I try to start the
>>>>>> router it says:
>>>>>>
>>>>>> Resource [Host:1] is unreachable: Host 1: Unable to start instance
>>>>>> due to Unable to start VM[DomainRouter|r-63-VM] due to error in
>>>>>> finalizeStart, not retrying
>>>>>>
>>>>>> The management server log says:
>>>>>>
>>>>>> 2014-04-30 12:20:52,485 ERROR [cloud.async.AsyncJobManagerImpl]
>>>>>> (Job-Executor-7:job-520 = [ 9d7c898c-b5d0-4bd0-a711-563a91d7acc9 ])
>>>>>> Unexpected exception while executing
>>>>>> org.apache.cloudstack.api.command.admin.router.StartRouterCmd
>>>>>> com.cloud.exception.AgentUnavailableException: Resource [Host:1] is
>>>>>> unreachable: Host 1: Unable to start instance due to Unable to start
>>>>>> VM[DomainRouter|r-63-VM] due to error in finalizeStart, not retrying
>>>>>>
>>>>>>
>>>>>> On Wed, Apr 30, 2014 at 12:02 PM, stevenliang 
>>>>>> wrote:
>>>>>>
>>>>>>> I think you backed up the database when you upgraded.
>>>>>>> When you downgraded CS, you also need to restore the DB.
>>>>>>>
>>>>>>>
>>>>>>> On 30/04/14 02:58 PM, Ian Young wrote:
>>>>>>>
>>>>>>>> I think my problem stems from a partially downloaded system VM
>>>>>>>> template.  I just noticed systemvm-kvm-4.3 is stuck at 27%
>>>>>>>> downloaded.  It must have been interrupted during the upgrade to
>>>>>>>> 4.3.  At the moment I've rolled back to 4.2.1 with a somewhat usable
>>>>>>>> management interface, although the system VMs won't start.  I
>>>>>>>> suspect there is something in the database that is causing it to
>>>>>>>> try to use the 4.3 template.  How can I delete the template and
>>>>>>>> make sure the management server is using the older one?
>>>>>>>>
>>>>>>>>
>>>>>>>> On Tue, Apr 29, 2014 at 8:23 PM, Ian Young 
>>>>>>>> wrote:
>>>>&g

Re: failed to start virtual router

2014-04-30 Thread Ian Young
I've tried upgrading to 4.3 again.  After poking around some more in the
database, I've discovered that the KVM system VM template was only 27%
downloaded.  I think this is why the virtual router was unable to
start--the template was incomplete.  Is there a way to force it to resume
downloading?


On Wed, Apr 30, 2014 at 3:03 PM, Ian Young  wrote:

> The address in Infrastructure > Hosts > (management server) is set to the
> correct IP address, not 127.0.0.1.  Why are the logs referring to 127.0.0.1?
>
>
> On Wed, Apr 30, 2014 at 3:00 PM, Ian Young  wrote:
>
>> I notice my dashboard says "Management server node 127.0.0.1 is up."  It
>> used to have an actual address, not localhost.  Could this be causing
>> problems and if so, how can I set it back?
>>
>>
>> On Wed, Apr 30, 2014 at 12:40 PM, Ian Young wrote:
>>
>>> Yes, I replaced the new files with the rpmsave ones, which allowed the
>>> agent to start.  However, most of the functions in the management console
>>> fail.
>>>
>>>
>>> On Wed, Apr 30, 2014 at 12:34 PM, stevenliang wrote:
>>>
>>>> Do you have the file db.properties.rpmsave on management server and
>>>> agent.properties.rpmsave on agents? If so, and the date is correct, you can
>>>> use it rather than db.properties and agent.properties.
>>>> And then restart management and agent services.
>>>>
>>>>
>>>> On 30/04/14 03:26 PM, Ian Young wrote:
>>>>
>>>>> Yes, I restored the DB from the backup.  When I try to start the
>>>>> router it says:
>>>>>
>>>>> Resource [Host:1] is unreachable: Host 1: Unable to start instance due
>>>>> to Unable to start VM[DomainRouter|r-63-VM] due to error in
>>>>> finalizeStart, not retrying
>>>>>
>>>>> The management server log says:
>>>>>
>>>>> 2014-04-30 12:20:52,485 ERROR [cloud.async.AsyncJobManagerImpl]
>>>>> (Job-Executor-7:job-520 = [ 9d7c898c-b5d0-4bd0-a711-563a91d7acc9 ])
>>>>> Unexpected exception while executing
>>>>> org.apache.cloudstack.api.command.admin.router.StartRouterCmd
>>>>> com.cloud.exception.AgentUnavailableException: Resource [Host:1] is
>>>>> unreachable: Host 1: Unable to start instance due to Unable to start
>>>>> VM[DomainRouter|r-63-VM] due to error in finalizeStart, not retrying
>>>>>
>>>>>
>>>>> On Wed, Apr 30, 2014 at 12:02 PM, stevenliang 
>>>>> wrote:
>>>>>
>>>>>> I think you backed up the database when you upgraded.
>>>>>> When you downgraded CS, you also need to restore the DB.
>>>>>>
>>>>>>
>>>>>> On 30/04/14 02:58 PM, Ian Young wrote:
>>>>>>
>>>>>>> I think my problem stems from a partially downloaded system VM
>>>>>>> template.  I just noticed systemvm-kvm-4.3 is stuck at 27%
>>>>>>> downloaded.  It must have been interrupted during the upgrade to
>>>>>>> 4.3.  At the moment I've rolled back to 4.2.1 with a somewhat usable
>>>>>>> management interface, although the system VMs won't start.  I
>>>>>>> suspect there is something in the database that is causing it to
>>>>>>> try to use the 4.3 template.  How can I delete the template and
>>>>>>> make sure the management server is using the older one?
>>>>>>>
>>>>>>>
>>>>>>> On Tue, Apr 29, 2014 at 8:23 PM, Ian Young 
>>>>>>> wrote:
>>>>>>>
>>>>>>>   Ok, so I've figured out a way to identify volumes in the
>>>>>>> filesystem.  For
>>>>>>>
>>>>>>>> instance, /var/storage/primary/4d324e1a-e3a6-4da8-9c4d-44ad723482ad
>>>>>>>> is
>>>>>>>> the
>>>>>>>> root volume for an instance I want to back up.  Is this in qcow2
>>>>>>>> format
>>>>>>>> or
>>>>>>>> something else?  I'm using KVM.
>>>>>>>>
>>>>>>>>
>>>>&

Re: failed to start virtual router

2014-04-30 Thread Ian Young
I notice my dashboard says "Management server node 127.0.0.1 is up."  It
used to have an actual address, not localhost.  Could this be causing
problems and if so, how can I set it back?


On Wed, Apr 30, 2014 at 12:40 PM, Ian Young  wrote:

> Yes, I replaced the new files with the rpmsave ones, which allowed the
> agent to start.  However, most of the functions in the management console
> fail.
>
>
> On Wed, Apr 30, 2014 at 12:34 PM, stevenliang wrote:
>
>> Do you have the file db.properties.rpmsave on management server and
>> agent.properties.rpmsave on agents? If so, and the date is correct, you can
>> use it rather than db.properties and agent.properties.
>> And then restart management and agent services.
>>
>>
>> On 30/04/14 03:26 PM, Ian Young wrote:
>>
>>> Yes, I restored the DB from the backup.  When I try to start the router
>>> it says:
>>>
>>> Resource [Host:1] is unreachable: Host 1: Unable to start instance due to
>>> Unable to start VM[DomainRouter|r-63-VM] due to error in finalizeStart,
>>> not retrying
>>>
>>> The management server log says:
>>>
>>> 2014-04-30 12:20:52,485 ERROR [cloud.async.AsyncJobManagerImpl]
>>> (Job-Executor-7:job-520 = [ 9d7c898c-b5d0-4bd0-a711-563a91d7acc9 ])
>>> Unexpected exception while executing
>>> org.apache.cloudstack.api.command.admin.router.StartRouterCmd
>>> com.cloud.exception.AgentUnavailableException: Resource [Host:1] is
>>> unreachable: Host 1: Unable to start instance due to Unable to start
>>> VM[DomainRouter|r-63-VM] due to error in finalizeStart, not retrying
>>>
>>>
>>> On Wed, Apr 30, 2014 at 12:02 PM, stevenliang 
>>> wrote:
>>>
>>>> I think you backed up the database when you upgraded.
>>>> When you downgraded CS, you also need to restore the DB.
>>>>
>>>>
>>>> On 30/04/14 02:58 PM, Ian Young wrote:
>>>>
>>>>> I think my problem stems from a partially downloaded system VM
>>>>> template.  I just noticed systemvm-kvm-4.3 is stuck at 27% downloaded.
>>>>> It must have been interrupted during the upgrade to 4.3.  At the
>>>>> moment I've rolled back to 4.2.1 with a somewhat usable management
>>>>> interface, although the system VMs won't start.  I suspect there is
>>>>> something in the database that is causing it to try to use the 4.3
>>>>> template.  How can I delete the template and make sure the management
>>>>> server is using the older one?
>>>>>
>>>>>
>>>>> On Tue, Apr 29, 2014 at 8:23 PM, Ian Young 
>>>>> wrote:
>>>>>
>>>>>   Ok, so I've figured out a way to identify volumes in the filesystem.
>>>>>  For
>>>>>
>>>>>> instance, /var/storage/primary/4d324e1a-e3a6-4da8-9c4d-44ad723482ad
>>>>>> is
>>>>>> the
>>>>>> root volume for an instance I want to back up.  Is this in qcow2
>>>>>> format
>>>>>> or
>>>>>> something else?  I'm using KVM.
>>>>>>
>>>>>>
>>>>>> On Tue, Apr 29, 2014 at 7:38 PM, Ian Young 
>>>>>> wrote:
>>>>>>
>>>>>>   Now I can't start cloudstack-agent.  The agent.log says:
>>>>>>
>>>>>>> Unable to start agent: Failed to get private nic name
>>>>>>>
>>>>>>> I know this is because the network bridge is no longer set up
>>>>>>> correctly.
>>>>>>>I used to have a cloud0 and a cloudbr0 interface.  Now I only have
>>>>>>> cloudbr0.  I haven't changed my network configuration.  Somehow it's
>>>>>>> been
>>>>>>> changed by CloudStack during the upgrade/downgrade.  This is getting
>>>>>>> worse
>>>>>>> and worse the more I try to recover my data.  Is there any way to
>>>>>>> back
>>>>>>> up
>>>>>>> the instances' volumes via the command line?  I can't tell which is
>>>>>>> which
>>>>>>> because the filenames are all hashes.  I really need to get these
>>>>>>> instances
>>>>>>> up and running--there ar

Re: failed to start virtual router

2014-04-30 Thread Ian Young
The address in Infrastructure > Hosts > (management server) is set to the
correct IP address, not 127.0.0.1.  Why are the logs referring to 127.0.0.1?


On Wed, Apr 30, 2014 at 3:00 PM, Ian Young  wrote:

> I notice my dashboard says "Management server node 127.0.0.1 is up."  It
> used to have an actual address, not localhost.  Could this be causing
> problems and if so, how can I set it back?
>
>
> On Wed, Apr 30, 2014 at 12:40 PM, Ian Young wrote:
>
>> Yes, I replaced the new files with the rpmsave ones, which allowed the
>> agent to start.  However, most of the functions in the management console
>> fail.
>>
>>
>> On Wed, Apr 30, 2014 at 12:34 PM, stevenliang wrote:
>>
>>> Do you have the file db.properties.rpmsave on management server and
>>> agent.properties.rpmsave on agents? If so, and the date is correct, you can
>>> use it rather than db.properties and agent.properties.
>>> And then restart management and agent services.
>>>
>>>
>>> On 30/04/14 03:26 PM, Ian Young wrote:
>>>
>>>> Yes, I restored the DB from the backup.  When I try to start the router
>>>> it says:
>>>>
>>>> Resource [Host:1] is unreachable: Host 1: Unable to start instance due
>>>> to Unable to start VM[DomainRouter|r-63-VM] due to error in
>>>> finalizeStart, not retrying
>>>>
>>>> The management server log says:
>>>>
>>>> 2014-04-30 12:20:52,485 ERROR [cloud.async.AsyncJobManagerImpl]
>>>> (Job-Executor-7:job-520 = [ 9d7c898c-b5d0-4bd0-a711-563a91d7acc9 ])
>>>> Unexpected exception while executing
>>>> org.apache.cloudstack.api.command.admin.router.StartRouterCmd
>>>> com.cloud.exception.AgentUnavailableException: Resource [Host:1] is
>>>> unreachable: Host 1: Unable to start instance due to Unable to start
>>>> VM[DomainRouter|r-63-VM] due to error in finalizeStart, not retrying
>>>>
>>>>
>>>> On Wed, Apr 30, 2014 at 12:02 PM, stevenliang 
>>>> wrote:
>>>>
>>>>  I think you had backed up database, when you upgraded.
>>>>> When you downgraded CS, you also need to restore DB.
>>>>>
>>>>>
>>>>> On 30/04/14 02:58 PM, Ian Young wrote:
>>>>>
>>>>>  I think my problem stems from a partially downloaded system VM
>>>>>> template.
>>>>>>   I
>>>>>> just noticed systemvm-kvm-4.3 is stuck at 27% downloaded.  It must
>>>>>> have
>>>>>> been interrupted during the upgrade to 4.3.  At the moment I've rolled
>>>>>> back
>>>>>> to 4.2.1 with a somewhat usable management interface, although the
>>>>>> system
>>>>>> VMs won't start.  I suspect there is something in the database that is
>>>>>> causing it to try to use the 4.3 template.  How can I delete the
>>>>>> template
>>>>>> and make sure the management server is using the older one?
>>>>>>
>>>>>>
>>>>>> On Tue, Apr 29, 2014 at 8:23 PM, Ian Young 
>>>>>> wrote:
>>>>>>
>>>>>>   Ok, so I've figured out a way to identify volumes in the
>>>>>> filesystem.  For
>>>>>>
>>>>>>> instance, /var/storage/primary/4d324e1a-e3a6-4da8-9c4d-44ad723482ad
>>>>>>> is
>>>>>>> the
>>>>>>> root volume for an instance I want to back up.  Is this in qcow2
>>>>>>> format
>>>>>>> or
>>>>>>> something else?  I'm using KVM.
>>>>>>>
>>>>>>>
>>>>>>> On Tue, Apr 29, 2014 at 7:38 PM, Ian Young 
>>>>>>> wrote:
>>>>>>>
>>>>>>>   Now I can't start cloudstack-agent.  The agent.log says:
>>>>>>>
>>>>>>>> Unable to start agent: Failed to get private nic name
>>>>>>>>
>>>>>>>> I know this is because the network bridge is no longer set up
>>>>>>>> correctly.
>>>>>>>>I used to have a cloud0 and a cloudbr0 interface.  Now I only
>>>>>>>> have
>>>>>>>> cloudbr0.  I haven't changed my network configuration.  Somehow it's
>>>>>

Re: failed to start virtual router

2014-04-30 Thread Ian Young
Yes, I replaced the new files with the rpmsave ones, which allowed the
agent to start.  However, most of the functions in the management console
fail.
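For anyone following the rpmsave suggestion quoted below, here is a minimal
Python sketch for checking which saved config files exist and whether each
one is newer than the file that replaced it.  The function name and the
idea of passing in a directory are illustrative assumptions; the directory
holding db.properties or agent.properties on a given install is not stated
in this thread, so pass whatever path applies.

```python
import os


def find_rpmsave_pairs(config_dir):
    """List (saved_path, current_path, saved_is_newer) for each *.rpmsave file.

    saved_is_newer is None when the replacement file does not exist.
    """
    suffix = ".rpmsave"
    pairs = []
    for name in sorted(os.listdir(config_dir)):
        if not name.endswith(suffix):
            continue
        saved = os.path.join(config_dir, name)
        current = saved[: -len(suffix)]
        if os.path.exists(current):
            # Compare modification times to see which copy is more recent.
            newer = os.path.getmtime(saved) > os.path.getmtime(current)
        else:
            newer = None
        pairs.append((saved, current, newer))
    return pairs
```

This only reports timestamps; it does not copy anything, so deciding which
file to keep is still a manual step.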


On Wed, Apr 30, 2014 at 12:34 PM, stevenliang  wrote:

> Do you have the file db.properties.rpmsave on management server and
> agent.properties.rpmsave on agents? If so, and the date is correct, you can
> use it rather than db.properties and agent.properties.
> And then restart management and agent services.
>
>
> On 30/04/14 03:26 PM, Ian Young wrote:
>
>> Yes, I restored the DB from the backup.  When I try to start the router it
>> says:
>>
>> Resource [Host:1] is unreachable: Host 1: Unable to start instance due to
>> Unable to start VM[DomainRouter|r-63-VM] due to error in finalizeStart,
>> not retrying
>>
>> The management server log says:
>>
>> 2014-04-30 12:20:52,485 ERROR [cloud.async.AsyncJobManagerImpl]
>> (Job-Executor-7:job-520 = [ 9d7c898c-b5d0-4bd0-a711-563a91d7acc9 ])
>> Unexpected exception while executing
>> org.apache.cloudstack.api.command.admin.router.StartRouterCmd
>> com.cloud.exception.AgentUnavailableException: Resource [Host:1] is
>> unreachable: Host 1: Unable to start instance due to Unable to start
>> VM[DomainRouter|r-63-VM] due to error in finalizeStart, not retrying
>>
>>
>> On Wed, Apr 30, 2014 at 12:02 PM, stevenliang 
>> wrote:
>>
>>  I think you had backed up database, when you upgraded.
>>> When you downgraded CS, you also need to restore DB.
>>>
>>>
>>> On 30/04/14 02:58 PM, Ian Young wrote:
>>>
>>>  I think my problem stems from a partially downloaded system VM template.
>>>>   I
>>>> just noticed systemvm-kvm-4.3 is stuck at 27% downloaded.  It must have
>>>> been interrupted during the upgrade to 4.3.  At the moment I've rolled
>>>> back
>>>> to 4.2.1 with a somewhat usable management interface, although the
>>>> system
>>>> VMs won't start.  I suspect there is something in the database that is
>>>> causing it to try to use the 4.3 template.  How can I delete the
>>>> template
>>>> and make sure the management server is using the older one?
>>>>
>>>>
>>>> On Tue, Apr 29, 2014 at 8:23 PM, Ian Young 
>>>> wrote:
>>>>
>>>>   Ok, so I've figured out a way to identify volumes in the filesystem.
>>>>  For
>>>>
>>>>> instance, /var/storage/primary/4d324e1a-e3a6-4da8-9c4d-44ad723482ad is
>>>>> the
>>>>> root volume for an instance I want to back up.  Is this in qcow2 format
>>>>> or
>>>>> something else?  I'm using KVM.
>>>>>
>>>>>
>>>>> On Tue, Apr 29, 2014 at 7:38 PM, Ian Young 
>>>>> wrote:
>>>>>
>>>>>   Now I can't start cloudstack-agent.  The agent.log says:
>>>>>
>>>>>> Unable to start agent: Failed to get private nic name
>>>>>>
>>>>>> I know this is because the network bridge is no longer set up
>>>>>> correctly.
>>>>>>I used to have a cloud0 and a cloudbr0 interface.  Now I only have
>>>>>> cloudbr0.  I haven't changed my network configuration.  Somehow it's
>>>>>> been
>>>>>> changed by CloudStack during the upgrade/downgrade.  This is getting
>>>>>> worse
>>>>>> and worse the more I try to recover my data.  Is there any way to back
>>>>>> up
>>>>>> the instances' volumes via the command line?  I can't tell which is
>>>>>> which
>>>>>> because the filenames are all hashes.  I really need to get these
>>>>>> instances
>>>>>> up and running--there are several months worth of work at stake here.
>>>>>>
>>>>>>
>>>>>> On Tue, Apr 29, 2014 at 6:13 PM, ma y  wrote:
>>>>>>
>>>>>>   I got the same problem, and how to downgrade CS 4.3.0 to 4.2.1
>>>>>> safely?
>>>>>>
>>>>>>>
>>>>>>> 2014-04-30 8:45 GMT+08:00 Ian Young :
>>>>>>>
>>>>>>>   Ok, my Cloudstack installation is now so broken that I think it's
>>>>>>> probably
>>>>>>>
>>>>>>>  best to backup all my instances and templates, wipe the databases,

Re: failed to start virtual router

2014-04-30 Thread Ian Young
Yes, I restored the DB from the backup.  When I try to start the router it
says:

Resource [Host:1] is unreachable: Host 1: Unable to start instance due to
Unable to start VM[DomainRouter|r-63-VM] due to error in finalizeStart, not
retrying

The management server log says:

2014-04-30 12:20:52,485 ERROR [cloud.async.AsyncJobManagerImpl]
(Job-Executor-7:job-520 = [ 9d7c898c-b5d0-4bd0-a711-563a91d7acc9 ])
Unexpected exception while executing
org.apache.cloudstack.api.command.admin.router.StartRouterCmd
com.cloud.exception.AgentUnavailableException: Resource [Host:1] is
unreachable: Host 1: Unable to start instance due to Unable to start
VM[DomainRouter|r-63-VM] due to error in finalizeStart, not retrying
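A minimal Python sketch, for readers who want to pull entries like the one
above out of a long management server log.  It assumes each entry starts
with a "date time,millis LEVEL [logger]" header, as the lines pasted in
this thread do, and that any following lines belong to the same entry.
The function names are illustrative, not part of CloudStack.

```python
import re

# Header of one log entry, e.g.
# "2014-04-30 12:20:52,485 ERROR [cloud.async.AsyncJobManagerImpl] ..."
ENTRY_START = re.compile(
    r"^(?P<time>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3})\s+"
    r"(?P<level>[A-Z]+)\s+\[(?P<logger>[^\]]+)\]\s*(?P<rest>.*)$"
)


def parse_entries(lines):
    """Group raw log lines into entries; continuation lines join the message."""
    entries = []
    for raw in lines:
        line = raw.rstrip("\n")
        match = ENTRY_START.match(line)
        if match:
            entries.append({
                "time": match.group("time"),
                "level": match.group("level"),
                "logger": match.group("logger"),
                "message": match.group("rest").strip(),
            })
        elif entries and line.strip():
            # Not a header: treat it as a continuation of the previous entry.
            joined = entries[-1]["message"] + " " + line.strip()
            entries[-1]["message"] = joined.strip()
    return entries


def filter_level(entries, level):
    """Keep only the entries logged at the given level, e.g. "ERROR"."""
    return [entry for entry in entries if entry["level"] == level]
```

Usage would be along the lines of
filter_level(parse_entries(open(path)), "ERROR"), where path is wherever
the management server log lives on your install.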


On Wed, Apr 30, 2014 at 12:02 PM, stevenliang  wrote:

> I assume you backed up the database when you upgraded.
> When you downgrade CS, you also need to restore the DB.
>
>
> On 30/04/14 02:58 PM, Ian Young wrote:
>
>> I think my problem stems from a partially downloaded system VM template.
>> I just noticed systemvm-kvm-4.3 is stuck at 27% downloaded.  It must have
>> been interrupted during the upgrade to 4.3.  At the moment I've rolled
>> back to 4.2.1 with a somewhat usable management interface, although the
>> system VMs won't start.  I suspect there is something in the database that
>> is causing it to try to use the 4.3 template.  How can I delete the
>> template and make sure the management server is using the older one?
>>
>>
>> On Tue, Apr 29, 2014 at 8:23 PM, Ian Young 
>> wrote:
>>
>>  Ok, so I've figured out a way to identify volumes in the filesystem.  For
>>> instance, /var/storage/primary/4d324e1a-e3a6-4da8-9c4d-44ad723482ad is
>>> the
>>> root volume for an instance I want to back up.  Is this in qcow2 format
>>> or
>>> something else?  I'm using KVM.
>>>
>>>
>>> On Tue, Apr 29, 2014 at 7:38 PM, Ian Young 
>>> wrote:
>>>
>>>  Now I can't start cloudstack-agent.  The agent.log says:
>>>>
>>>> Unable to start agent: Failed to get private nic name
>>>>
>>>> I know this is because the network bridge is no longer set up correctly.
>>>>   I used to have a cloud0 and a cloudbr0 interface.  Now I only have
>>>> cloudbr0.  I haven't changed my network configuration.  Somehow it's
>>>> been
>>>> changed by CloudStack during the upgrade/downgrade.  This is getting
>>>> worse
>>>> and worse the more I try to recover my data.  Is there any way to back
>>>> up
>>>> the instances' volumes via the command line?  I can't tell which is
>>>> which
>>>> because the filenames are all hashes.  I really need to get these
>>>> instances
>>>> up and running--there are several months worth of work at stake here.
>>>>
>>>>
>>>> On Tue, Apr 29, 2014 at 6:13 PM, ma y  wrote:
>>>>
>>>>  I got the same problem, and how to downgrade CS 4.3.0 to 4.2.1 safely?
>>>>>
>>>>>
>>>>> 2014-04-30 8:45 GMT+08:00 Ian Young :
>>>>>
>>>>>  Ok, my Cloudstack installation is now so broken that I think it's
>>>>>>
>>>>> probably
>>>>>
>>>>>> best to backup all my instances and templates, wipe the databases, and
>>>>>> start from scratch.  However, I can't take snapshots or download
>>>>>>
>>>>> volumes
>>>>>
>>>>>> anymore.  What's causing these errors?
>>>>>>
>>>>>> 2014-04-29 17:40:51,264 DEBUG [o.a.c.s.m.AncientDataMotionStrategy]
>>>>>> (Job-Executor-11:ctx-0a3ead79 ctx-315eda05) copy object failed:
>>>>>> com.cloud.utils.exception.CloudRuntimeException: Failed to send
>>>>>>
>>>>> command,
>>>>>
>>>>>> due to Agent:1, com.cloud.exception.OperationTimedoutException:
>>>>>>
>>>>> Commands
>>>>>
>>>>>> 841744457 to Host 1 timed out after 21600
>>>>>> 2014-04-29 17:40:51,265 DEBUG [o.a.c.s.m.AncientDataMotionStrategy]
>>>>>> (Job-Executor-11:ctx-0a3ead79 ctx-315eda05) copy failed
>>>>>> com.cloud.utils.exception.CloudRuntimeException:
>>>>>> com.cloud.utils.exception.CloudRuntimeException: Failed to send
>>>>>>
>>>>> command,
>>>>>

Re: failed to start virtual router

2014-04-30 Thread Ian Young
I think my problem stems from a partially downloaded system VM template.  I
just noticed systemvm-kvm-4.3 is stuck at 27% downloaded.  It must have
been interrupted during the upgrade to 4.3.  At the moment I've rolled back
to 4.2.1 with a somewhat usable management interface, although the system
VMs won't start.  I suspect there is something in the database that is
causing it to try to use the 4.3 template.  How can I delete the template
and make sure the management server is using the older one?


On Tue, Apr 29, 2014 at 8:23 PM, Ian Young  wrote:

> Ok, so I've figured out a way to identify volumes in the filesystem.  For
> instance, /var/storage/primary/4d324e1a-e3a6-4da8-9c4d-44ad723482ad is the
> root volume for an instance I want to back up.  Is this in qcow2 format or
> something else?  I'm using KVM.
>
>
> On Tue, Apr 29, 2014 at 7:38 PM, Ian Young  wrote:
>
>> Now I can't start cloudstack-agent.  The agent.log says:
>>
>> Unable to start agent: Failed to get private nic name
>>
>> I know this is because the network bridge is no longer set up correctly.
>>  I used to have a cloud0 and a cloudbr0 interface.  Now I only have
>> cloudbr0.  I haven't changed my network configuration.  Somehow it's been
>> changed by CloudStack during the upgrade/downgrade.  This is getting worse
>> and worse the more I try to recover my data.  Is there any way to back up
>> the instances' volumes via the command line?  I can't tell which is which
>> because the filenames are all hashes.  I really need to get these instances
>> up and running--there are several months worth of work at stake here.
>>
>>
>> On Tue, Apr 29, 2014 at 6:13 PM, ma y  wrote:
>>
>>> I got the same problem, and how to downgrade CS 4.3.0 to 4.2.1 safely?
>>>
>>>
>>> 2014-04-30 8:45 GMT+08:00 Ian Young :
>>>
>>> > Ok, my Cloudstack installation is now so broken that I think it's
>>> probably
>>> > best to backup all my instances and templates, wipe the databases, and
>>> > start from scratch.  However, I can't take snapshots or download
>>> volumes
>>> > anymore.  What's causing these errors?
>>> >
>>> > 2014-04-29 17:40:51,264 DEBUG [o.a.c.s.m.AncientDataMotionStrategy]
>>> > (Job-Executor-11:ctx-0a3ead79 ctx-315eda05) copy object failed:
>>> > com.cloud.utils.exception.CloudRuntimeException: Failed to send
>>> command,
>>> > due to Agent:1, com.cloud.exception.OperationTimedoutException:
>>> Commands
>>> > 841744457 to Host 1 timed out after 21600
>>> > 2014-04-29 17:40:51,265 DEBUG [o.a.c.s.m.AncientDataMotionStrategy]
>>> > (Job-Executor-11:ctx-0a3ead79 ctx-315eda05) copy failed
>>> > com.cloud.utils.exception.CloudRuntimeException:
>>> > com.cloud.utils.exception.CloudRuntimeException: Failed to send
>>> command,
>>> > due to Agent:1, com.cloud.exception.OperationTimedoutException:
>>> Commands
>>> > 841744457 to Host 1 timed out after 21600
>>> > 2014-04-29 17:40:51,269 WARN  [o.a.c.s.d.ObjectInDataStoreManagerImpl]
>>> > (Job-Executor-11:ctx-0a3ead79 ctx-315eda05) Unsupported data object
>>> > (VOLUME,
>>> > org.apache.cloudstack.storage.datastore.PrimaryDataStoreImpl@7bbbd901),
>>> no
>>> > need to delete from object in store ref table
>>> > 2014-04-29 17:40:51,280 ERROR [c.c.a.ApiAsyncJobDispatcher]
>>> > (Job-Executor-11:ctx-0a3ead79) Unexpected exception while executing
>>> > org.apache.cloudstack.api.command.user.volume.ExtractVolumeCmd
>>> > com.cloud.utils.exception.CloudRuntimeException: Failed to copy the
>>> volume
>>> > from the source primary storage pool to secondary storage.
>>> > 2014-04-29 17:40:51,282 DEBUG [o.a.c.f.j.i.AsyncJobManagerImpl]
>>> > (Job-Executor-11:ctx-0a3ead79) Complete async job-501, jobStatus:
>>> FAILED,
>>> > resultCode: 530, result:
>>> >
>>> >
>>> org.apache.cloudstack.api.response.ExceptionResponse/null/{"uuidList":[],"errorcode":530,"errortext":"Failed
>>> > to copy the volume from the source primary storage pool to secondary
>>> > storage."}
>>> >
>>> >
>>> > On Tue, Apr 29, 2014 at 4:15 PM, Ian Young 
>>> wrote:
>>> >
>>> > > I downgraded to 4.2.1 again but cloudstack-management won't start
>>> because
>>> > > the database is version 4.3. Is it safe to restore t

Re: failed to start virtual router

2014-04-29 Thread Ian Young
Ok, so I've figured out a way to identify volumes in the filesystem.  For
instance, /var/storage/primary/4d324e1a-e3a6-4da8-9c4d-44ad723482ad is the
root volume for an instance I want to back up.  Is this in qcow2 format or
something else?  I'm using KVM.
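One way to answer the format question without CloudStack running is to
look at the first bytes of the file.  QCOW images begin with the four
bytes "QFI" followed by 0xfb, then a big-endian 32-bit version number.
The sketch below is a minimal Python check built on that; the function
name is illustrative, and "qemu-img info <path>" is the usual tool for a
fuller report.

```python
import struct

QCOW_MAGIC = b"QFI\xfb"  # first four bytes of a QCOW image header


def detect_image_format(path):
    """Return a short description of the image format based on its header."""
    with open(path, "rb") as image:
        header = image.read(8)
    if len(header) == 8 and header[:4] == QCOW_MAGIC:
        # Bytes 4..7 hold the header version as a big-endian unsigned int.
        (version,) = struct.unpack(">I", header[4:8])
        return "qcow (header version %d)" % version
    # Raw images have no signature, so anything else is reported as unknown.
    return "raw or unknown"
```

A header version of 2 or 3 indicates qcow2.  The check only reads eight
bytes, so it is safe to run against a volume that is in use.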


On Tue, Apr 29, 2014 at 7:38 PM, Ian Young  wrote:

> Now I can't start cloudstack-agent.  The agent.log says:
>
> Unable to start agent: Failed to get private nic name
>
> I know this is because the network bridge is no longer set up correctly.
>  I used to have a cloud0 and a cloudbr0 interface.  Now I only have
> cloudbr0.  I haven't changed my network configuration.  Somehow it's been
> changed by CloudStack during the upgrade/downgrade.  This is getting worse
> and worse the more I try to recover my data.  Is there any way to back up
> the instances' volumes via the command line?  I can't tell which is which
> because the filenames are all hashes.  I really need to get these instances
> up and running--there are several months worth of work at stake here.
>
>
> On Tue, Apr 29, 2014 at 6:13 PM, ma y  wrote:
>
>> I got the same problem, and how to downgrade CS 4.3.0 to 4.2.1 safely?
>>
>>
>> 2014-04-30 8:45 GMT+08:00 Ian Young :
>>
>> > Ok, my Cloudstack installation is now so broken that I think it's
>> probably
>> > best to backup all my instances and templates, wipe the databases, and
>> > start from scratch.  However, I can't take snapshots or download volumes
>> > anymore.  What's causing these errors?
>> >
>> > 2014-04-29 17:40:51,264 DEBUG [o.a.c.s.m.AncientDataMotionStrategy]
>> > (Job-Executor-11:ctx-0a3ead79 ctx-315eda05) copy object failed:
>> > com.cloud.utils.exception.CloudRuntimeException: Failed to send command,
>> > due to Agent:1, com.cloud.exception.OperationTimedoutException: Commands
>> > 841744457 to Host 1 timed out after 21600
>> > 2014-04-29 17:40:51,265 DEBUG [o.a.c.s.m.AncientDataMotionStrategy]
>> > (Job-Executor-11:ctx-0a3ead79 ctx-315eda05) copy failed
>> > com.cloud.utils.exception.CloudRuntimeException:
>> > com.cloud.utils.exception.CloudRuntimeException: Failed to send command,
>> > due to Agent:1, com.cloud.exception.OperationTimedoutException: Commands
>> > 841744457 to Host 1 timed out after 21600
>> > 2014-04-29 17:40:51,269 WARN  [o.a.c.s.d.ObjectInDataStoreManagerImpl]
>> > (Job-Executor-11:ctx-0a3ead79 ctx-315eda05) Unsupported data object
>> > (VOLUME,
>> > org.apache.cloudstack.storage.datastore.PrimaryDataStoreImpl@7bbbd901),
>> no
>> > need to delete from object in store ref table
>> > 2014-04-29 17:40:51,280 ERROR [c.c.a.ApiAsyncJobDispatcher]
>> > (Job-Executor-11:ctx-0a3ead79) Unexpected exception while executing
>> > org.apache.cloudstack.api.command.user.volume.ExtractVolumeCmd
>> > com.cloud.utils.exception.CloudRuntimeException: Failed to copy the
>> volume
>> > from the source primary storage pool to secondary storage.
>> > 2014-04-29 17:40:51,282 DEBUG [o.a.c.f.j.i.AsyncJobManagerImpl]
>> > (Job-Executor-11:ctx-0a3ead79) Complete async job-501, jobStatus:
>> FAILED,
>> > resultCode: 530, result:
>> >
>> >
>> org.apache.cloudstack.api.response.ExceptionResponse/null/{"uuidList":[],"errorcode":530,"errortext":"Failed
>> > to copy the volume from the source primary storage pool to secondary
>> > storage."}
>> >
>> >
>> > On Tue, Apr 29, 2014 at 4:15 PM, Ian Young 
>> wrote:
>> >
>> > > I downgraded to 4.2.1 again but cloudstack-management won't start
>> because
>> > > the database is version 4.3. Is it safe to restore the database
>> backup I
>> > > made prior to this whole process? In the meantime I have destroyed and
>> > > created system VMs, so I'm not sure it's a good idea.
>> > > On Apr 29, 2014 3:09 PM, "Ian Young"  wrote:
>> > >
>> > >> @stevenliang: I take it back--you can't set the VM size when you
>> > register
>> > >> the template.
>> > >>
>> > >>
>> > >> On Tue, Apr 29, 2014 at 3:02 PM, motty cruz 
>> > wrote:
>> > >>
>> > >>> yes, you would have to shutdown the router, then click on "Change
>> > Service
>> > >>> Offering"
>> > >>> restart the VR.
>> > >>>
>> > >>> To Ian,
>> > >>>
>> > >>> I suspe

Re: failed to start virtual router

2014-04-29 Thread Ian Young
Now I can't start cloudstack-agent.  The agent.log says:

Unable to start agent: Failed to get private nic name

I know this is because the network bridge is no longer set up correctly.  I
used to have a cloud0 and a cloudbr0 interface.  Now I only have cloudbr0.
I haven't changed my network configuration.  Somehow it's been changed by
CloudStack during the upgrade/downgrade.  This is getting worse and worse
the more I try to recover my data.  Is there any way to back up the
instances' volumes via the command line?  I can't tell which is which
because the filenames are all hashes.  I really need to get these instances
up and running--there are several months' worth of work at stake here.
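To confirm which bridges the host actually has, a minimal Python sketch
follows.  It relies on Linux exposing a "bridge" subdirectory under
/sys/class/net/<name> for bridge devices.  The function names are
illustrative, and the sketch only reports what exists; it does not say
which bridge the agent expects for its private NIC.

```python
import os


def list_bridges(sys_net="/sys/class/net"):
    """Return the sorted names of network interfaces that are bridges."""
    bridges = []
    for name in os.listdir(sys_net):
        # A bridge device has a "bridge" subdirectory in sysfs.
        if os.path.isdir(os.path.join(sys_net, name, "bridge")):
            bridges.append(name)
    return sorted(bridges)


def missing_bridges(expected, sys_net="/sys/class/net"):
    """Return the expected bridge names that are not present on this host."""
    present = set(list_bridges(sys_net))
    return [name for name in expected if name not in present]
```

For the situation described above, missing_bridges(["cloud0", "cloudbr0"])
would be expected to return ["cloud0"].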


On Tue, Apr 29, 2014 at 6:13 PM, ma y  wrote:

> I got the same problem, and how to downgrade CS 4.3.0 to 4.2.1 safely?
>
>
> 2014-04-30 8:45 GMT+08:00 Ian Young :
>
> > Ok, my Cloudstack installation is now so broken that I think it's
> probably
> > best to backup all my instances and templates, wipe the databases, and
> > start from scratch.  However, I can't take snapshots or download volumes
> > anymore.  What's causing these errors?
> >
> > 2014-04-29 17:40:51,264 DEBUG [o.a.c.s.m.AncientDataMotionStrategy]
> > (Job-Executor-11:ctx-0a3ead79 ctx-315eda05) copy object failed:
> > com.cloud.utils.exception.CloudRuntimeException: Failed to send command,
> > due to Agent:1, com.cloud.exception.OperationTimedoutException: Commands
> > 841744457 to Host 1 timed out after 21600
> > 2014-04-29 17:40:51,265 DEBUG [o.a.c.s.m.AncientDataMotionStrategy]
> > (Job-Executor-11:ctx-0a3ead79 ctx-315eda05) copy failed
> > com.cloud.utils.exception.CloudRuntimeException:
> > com.cloud.utils.exception.CloudRuntimeException: Failed to send command,
> > due to Agent:1, com.cloud.exception.OperationTimedoutException: Commands
> > 841744457 to Host 1 timed out after 21600
> > 2014-04-29 17:40:51,269 WARN  [o.a.c.s.d.ObjectInDataStoreManagerImpl]
> > (Job-Executor-11:ctx-0a3ead79 ctx-315eda05) Unsupported data object
> > (VOLUME,
> > org.apache.cloudstack.storage.datastore.PrimaryDataStoreImpl@7bbbd901),
> no
> > need to delete from object in store ref table
> > 2014-04-29 17:40:51,280 ERROR [c.c.a.ApiAsyncJobDispatcher]
> > (Job-Executor-11:ctx-0a3ead79) Unexpected exception while executing
> > org.apache.cloudstack.api.command.user.volume.ExtractVolumeCmd
> > com.cloud.utils.exception.CloudRuntimeException: Failed to copy the
> volume
> > from the source primary storage pool to secondary storage.
> > 2014-04-29 17:40:51,282 DEBUG [o.a.c.f.j.i.AsyncJobManagerImpl]
> > (Job-Executor-11:ctx-0a3ead79) Complete async job-501, jobStatus: FAILED,
> > resultCode: 530, result:
> >
> >
> org.apache.cloudstack.api.response.ExceptionResponse/null/{"uuidList":[],"errorcode":530,"errortext":"Failed
> > to copy the volume from the source primary storage pool to secondary
> > storage."}
> >
> >
> > On Tue, Apr 29, 2014 at 4:15 PM, Ian Young 
> wrote:
> >
> > > I downgraded to 4.2.1 again but cloudstack-management won't start
> because
> > > the database is version 4.3. Is it safe to restore the database backup
> I
> > > made prior to this whole process? In the meantime I have destroyed and
> > > created system VMs, so I'm not sure it's a good idea.
> > > On Apr 29, 2014 3:09 PM, "Ian Young"  wrote:
> > >
> > >> @stevenliang: I take it back--you can't set the VM size when you
> > register
> > >> the template.
> > >>
> > >>
> > >> On Tue, Apr 29, 2014 at 3:02 PM, motty cruz 
> > wrote:
> > >>
> > >>> yes, you would have to shutdown the router, then click on "Change
> > Service
> > >>> Offering"
> > >>> restart the VR.
> > >>>
> > >>> To Ian,
> > >>>
> > >>> I suspect you forgot the last step: " cloudstack-setup-management"
> > >>>
> > >>> that would fix your issue, I think,
> > >>>
> > >>> Thanks,
> > >>> ---
> > >>> I downgraded to 4.2.1 and then upgraded to 4.3.  Now the
> > >>> cloudstack-management service can't start because it can't connect to
> > the
> > >>> database.
> > >>>
> > >>> 2014-04-29 14:51:36,424 ERROR [c.c.u.d.Merovingian2] (main:null)
> Unable
> > >>> to
> > >>> get a new db connection
> > 

Re: failed to start virtual router

2014-04-29 Thread Ian Young
Ok, my CloudStack installation is now so broken that I think it's probably
best to back up all my instances and templates, wipe the databases, and
start from scratch.  However, I can't take snapshots or download volumes
anymore.  What's causing these errors?

2014-04-29 17:40:51,264 DEBUG [o.a.c.s.m.AncientDataMotionStrategy]
(Job-Executor-11:ctx-0a3ead79 ctx-315eda05) copy object failed:
com.cloud.utils.exception.CloudRuntimeException: Failed to send command,
due to Agent:1, com.cloud.exception.OperationTimedoutException: Commands
841744457 to Host 1 timed out after 21600
2014-04-29 17:40:51,265 DEBUG [o.a.c.s.m.AncientDataMotionStrategy]
(Job-Executor-11:ctx-0a3ead79 ctx-315eda05) copy failed
com.cloud.utils.exception.CloudRuntimeException:
com.cloud.utils.exception.CloudRuntimeException: Failed to send command,
due to Agent:1, com.cloud.exception.OperationTimedoutException: Commands
841744457 to Host 1 timed out after 21600
2014-04-29 17:40:51,269 WARN  [o.a.c.s.d.ObjectInDataStoreManagerImpl]
(Job-Executor-11:ctx-0a3ead79 ctx-315eda05) Unsupported data object
(VOLUME,
org.apache.cloudstack.storage.datastore.PrimaryDataStoreImpl@7bbbd901), no
need to delete from object in store ref table
2014-04-29 17:40:51,280 ERROR [c.c.a.ApiAsyncJobDispatcher]
(Job-Executor-11:ctx-0a3ead79) Unexpected exception while executing
org.apache.cloudstack.api.command.user.volume.ExtractVolumeCmd
com.cloud.utils.exception.CloudRuntimeException: Failed to copy the volume
from the source primary storage pool to secondary storage.
2014-04-29 17:40:51,282 DEBUG [o.a.c.f.j.i.AsyncJobManagerImpl]
(Job-Executor-11:ctx-0a3ead79) Complete async job-501, jobStatus: FAILED,
resultCode: 530, result:
org.apache.cloudstack.api.response.ExceptionResponse/null/{"uuidList":[],"errorcode":530,"errortext":"Failed
to copy the volume from the source primary storage pool to secondary
storage."}
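The last log entry above carries a JSON payload after the response class
name.  A minimal Python sketch for extracting it is shown here, assuming
the result string has first been rejoined into one line (the mail client
wrapped it) and that the payload starts at the first "{".  The function
name is illustrative.

```python
import json


def parse_job_result(result):
    """Extract the JSON object embedded in an async job result string."""
    start = result.find("{")
    if start == -1:
        raise ValueError("no JSON payload found in result string")
    # raw_decode parses one JSON value starting at the given index and
    # ignores anything that follows it.
    payload, _end = json.JSONDecoder().raw_decode(result, start)
    return payload
```

Applied to the result above, the returned dictionary has "errorcode" set
to 530 and "errortext" holding the "Failed to copy the volume" message.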


On Tue, Apr 29, 2014 at 4:15 PM, Ian Young  wrote:

> I downgraded to 4.2.1 again but cloudstack-management won't start because
> the database is version 4.3. Is it safe to restore the database backup I
> made prior to this whole process? In the meantime I have destroyed and
> created system VMs, so I'm not sure it's a good idea.
> On Apr 29, 2014 3:09 PM, "Ian Young"  wrote:
>
>> @stevenliang: I take it back--you can't set the VM size when you register
>> the template.
>>
>>
>> On Tue, Apr 29, 2014 at 3:02 PM, motty cruz  wrote:
>>
>>> yes, you would have to shutdown the router, then click on "Change Service
>>> Offering"
>>> restart the VR.
>>>
>>> To Ian,
>>>
>>> I suspect you forgot the last step: " cloudstack-setup-management"
>>>
>>> that would fix your issue, I think,
>>>
>>> Thanks,
>>> ---
>>> I downgraded to 4.2.1 and then upgraded to 4.3.  Now the
>>> cloudstack-management service can't start because it can't connect to the
>>> database.
>>>
>>> 2014-04-29 14:51:36,424 ERROR [c.c.u.d.Merovingian2] (main:null) Unable
>>> to
>>> get a new db connection
>>> Caused by: java.sql.SQLException: Access denied for user 'cloud'@
>>> 'localhost'
>>> (using password: YES)
>>>
>>> Where are the credentials stored?
>>>
>>>
>>> On Tue, Apr 29, 2014 at 2:57 PM, stevenliang 
>>> wrote:
>>>
>>> > oh, then change service offering for vr?
>>> >
>>> >
>>> > On 29/04/14 05:53 PM, motty cruz wrote:
>>> >
>>> >> for my VR, I created a new
>>> >>
>>> >>   "System Offering For Software Router"
>>> >> CPU in (MHz) 1.00GHz
>>> >> Memory (in MB) 1.00GB
>>> >>
>>> >> this are my current offerings, I'm sure the more RAM and CPU better
>>> >> performance.
>>> >>
>>> >> Thanks,
>>> >>
>>> >>
>>> >>
>>> >> On Tue, Apr 29, 2014 at 2:44 PM, stevenliang 
>>> >> wrote:
>>> >>
>>> >>  Thank you again, motty.
>>> >>> I didn't notice this earlier.
>>> >>> BTW, how did you make your vr had 1GB CPU and 512MB RAM?
>>> >>>
>>> >>>
>>> >>>
>>> >>> On 29/04/14 05:33 PM, motty cruz wrote:
>>> >>>
>>> >>>  Stevellang,
>>> >>>> I not sure if you saw this in the forums earlier :
>>> >>>>
>>> http://mail-archives.apache.org/mod_mbox

Re: failed to start virtual router

2014-04-29 Thread Ian Young
I downgraded to 4.2.1 again but cloudstack-management won't start because
the database is version 4.3. Is it safe to restore the database backup I
made prior to this whole process? In the meantime I have destroyed and
created system VMs, so I'm not sure it's a good idea.
On Apr 29, 2014 3:09 PM, "Ian Young"  wrote:

> @stevenliang: I take it back--you can't set the VM size when you register
> the template.
>
>
> On Tue, Apr 29, 2014 at 3:02 PM, motty cruz  wrote:
>
>> yes, you would have to shutdown the router, then click on "Change Service
>> Offering"
>> restart the VR.
>>
>> To Ian,
>>
>> I suspect you forgot the last step: " cloudstack-setup-management"
>>
>> that would fix your issue, I think,
>>
>> Thanks,
>> ---
>> I downgraded to 4.2.1 and then upgraded to 4.3.  Now the
>> cloudstack-management service can't start because it can't connect to the
>> database.
>>
>> 2014-04-29 14:51:36,424 ERROR [c.c.u.d.Merovingian2] (main:null) Unable to
>> get a new db connection
>> Caused by: java.sql.SQLException: Access denied for user 'cloud'@
>> 'localhost'
>> (using password: YES)
>>
>> Where are the credentials stored?
>>
>>
>> On Tue, Apr 29, 2014 at 2:57 PM, stevenliang 
>> wrote:
>>
>> > oh, then change service offering for vr?
>> >
>> >
>> > On 29/04/14 05:53 PM, motty cruz wrote:
>> >
>> >> for my VR, I created a new
>> >>
>> >>   "System Offering For Software Router"
>> >> CPU in (MHz) 1.00GHz
>> >> Memory (in MB) 1.00GB
>> >>
>> >> this are my current offerings, I'm sure the more RAM and CPU better
>> >> performance.
>> >>
>> >> Thanks,
>> >>
>> >>
>> >>
>> >> On Tue, Apr 29, 2014 at 2:44 PM, stevenliang 
>> >> wrote:
>> >>
>> >>  Thank you again, motty.
>> >>> I didn't notice this earlier.
>> >>> BTW, how did you make your vr had 1GB CPU and 512MB RAM?
>> >>>
>> >>>
>> >>>
>> >>> On 29/04/14 05:33 PM, motty cruz wrote:
>> >>>
>> >>>  Stevellang,
>> >>>> I not sure if you saw this in the forums earlier :
>> >>>>
>> http://mail-archives.apache.org/mod_mbox/cloudstack-users/201404.mbox/%
>> >>>> 3CCALoOYy6A10bz1zOQQs1VyFb9epqLfhf7mu6hc=c2rfedroy...@mail.gmail.com
>> %3E
>> >>>>
>> >>>> I don't know if the bug was fixed yet,
>> >>>>
>> >>>> I will try upgrade in the next couple of days on a testing cluster,
>> will
>> >>>> report back if the bug was fixed.
>> >>>>
>> >>>> Thanks,
>> >>>>
>> >>>>
>> >>>> On Tue, Apr 29, 2014 at 2:25 PM, stevenliang 
>> >>>> wrote:
>> >>>>
>> >>>>   Thank you, motty.
>> >>>>
>> >>>>> I am also running kvm. Since that time I failed upgrade, I am still
>> >>>>> using
>> >>>>> 4.2.1. I'll try as your advice.
>> >>>>>
>> >>>>>
>> >>>>> On 29/04/14 05:19 PM, motty cruz wrote:
>> >>>>>
>> >>>>>   Stevenllang,
>> >>>>>
>> >>>>>> I had the similar issue with VR, I notice it was because I leave
>> the
>> >>>>>> default system specs on the VR, for instance by default 500MHz on
>> CPU
>> >>>>>> and
>> >>>>>> 128MB on RAM, if you upgrade to at least 1GB on CPU and 512MB of
>> RAM
>> >>>>>> your
>> >>>>>> VR will survive the upgrade from 4.2.1 to 4.3.1.
>> >>>>>>
>> >>>>>> I am running KVM, when I upgrade from 4.2.1 to 4.3 my VMs were not
>> >>>>>> able
>> >>>>>> to
>> >>>>>> access outside world, even if I created a new router.
>> >>>>>>
>> >>>>>> wish you the best,
>> >>>>>> -motty
>> >>>>>>
>> >>>>>>
>> >>>>>> On Tue, Apr 29, 2014 at 2:13 PM, stevenliang <
>> stevenli...@yesup.com>
>> >

Re: failed to start virtual router

2014-04-29 Thread Ian Young
@stevenliang: I take it back--you can't set the VM size when you register
the template.


On Tue, Apr 29, 2014 at 3:02 PM, motty cruz  wrote:

> Yes, you would have to shut down the router, click "Change Service
> Offering", and then restart the VR.
>
> To Ian,
>
> I suspect you forgot the last step: "cloudstack-setup-management".
> That would fix your issue, I think.
>
> Thanks,
> ---
> I downgraded to 4.2.1 and then upgraded to 4.3.  Now the
> cloudstack-management service can't start because it can't connect to the
> database.
>
> 2014-04-29 14:51:36,424 ERROR [c.c.u.d.Merovingian2] (main:null) Unable to
> get a new db connection
> Caused by: java.sql.SQLException: Access denied for user 'cloud'@
> 'localhost'
> (using password: YES)
>
> Where are the credentials stored?
>
>
> On Tue, Apr 29, 2014 at 2:57 PM, stevenliang 
> wrote:
>
> > oh, then change service offering for vr?
> >
> >
> > On 29/04/14 05:53 PM, motty cruz wrote:
> >
> >> For my VR, I created a new
> >>
> >>   "System Offering For Software Router"
> >> CPU (in MHz): 1.00GHz
> >> Memory (in MB): 1.00GB
> >>
> >> These are my current offerings. I'm sure more RAM and CPU would give
> >> better performance.
> >>
> >> Thanks,
> >>
> >>
> >>
> >> On Tue, Apr 29, 2014 at 2:44 PM, stevenliang 
> >> wrote:
> >>
> >>  Thank you again, motty.
> >>> I didn't notice this earlier.
> >>> BTW, how did you make your vr had 1GB CPU and 512MB RAM?
> >>>
> >>>
> >>>
> >>> On 29/04/14 05:33 PM, motty cruz wrote:
> >>>
> >>>  Stevellang,
> >>>> I not sure if you saw this in the forums earlier :
> >>>>
> http://mail-archives.apache.org/mod_mbox/cloudstack-users/201404.mbox/%
> >>>> 3CCALoOYy6A10bz1zOQQs1VyFb9epqLfhf7mu6hc=c2rfedroy...@mail.gmail.com
> %3E
> >>>>
> >>>> I don't know if the bug was fixed yet,
> >>>>
> >>>> I will try upgrade in the next couple of days on a testing cluster,
> will
> >>>> report back if the bug was fixed.
> >>>>
> >>>> Thanks,
> >>>>
> >>>>
> >>>> On Tue, Apr 29, 2014 at 2:25 PM, stevenliang 
> >>>> wrote:
> >>>>
> >>>>   Thank you, motty.
> >>>>
> >>>>> I am also running kvm. Since that time I failed upgrade, I am still
> >>>>> using
> >>>>> 4.2.1. I'll try as your advice.
> >>>>>
> >>>>>
> >>>>> On 29/04/14 05:19 PM, motty cruz wrote:
> >>>>>
> >>>>>   Stevenllang,
> >>>>>
> >>>>>> I had the similar issue with VR, I notice it was because I leave the
> >>>>>> default system specs on the VR, for instance by default 500MHz on
> CPU
> >>>>>> and
> >>>>>> 128MB on RAM, if you upgrade to at least 1GB on CPU and 512MB of RAM
> >>>>>> your
> >>>>>> VR will survive the upgrade from 4.2.1 to 4.3.1.
> >>>>>>
> >>>>>> I am running KVM, when I upgrade from 4.2.1 to 4.3 my VMs were not
> >>>>>> able
> >>>>>> to
> >>>>>> access outside world, even if I created a new router.
> >>>>>>
> >>>>>> wish you the best,
> >>>>>> -motty
> >>>>>>
> >>>>>>
> >>>>>> On Tue, Apr 29, 2014 at 2:13 PM, stevenliang  >
> >>>>>> wrote:
> >>>>>>
> >>>>>>Yes, I had two zones(one is basic, another is advanced mode).
> >>>>>>
> >>>>>>  After I upgraded from 4.2.1 to 4.3, the vrouter lost.
> >>>>>>> So I rolled back to 4.2.1, the vrouter came back.
> >>>>>>>
> >>>>>>>
> >>>>>>> On 29/04/14 04:54 PM, Ian Young wrote:
> >>>>>>>
> >>>>>>>Did rolling back to 4.2 fix the problem?
> >>>>>>>
> >>>>>>>  On Tue, Apr 29, 2014 at 1:22 PM, stevenliang <
> stevenli...@yesup.com
> >>>>>>>> >
> >>>>>>>> wrote:
> 

Re: failed to start virtual router

2014-04-29 Thread Ian Young
I downgraded to 4.2.1 and then upgraded to 4.3.  Now the
cloudstack-management service can't start because it can't connect to the
database.

2014-04-29 14:51:36,424 ERROR [c.c.u.d.Merovingian2] (main:null) Unable to
get a new db connection
Caused by: java.sql.SQLException: Access denied for user 'cloud'@'localhost'
(using password: YES)

Where are the credentials stored?


On Tue, Apr 29, 2014 at 2:55 PM, Ian Young  wrote:

> I think you can do that when you register the new templates in step 1 of
> this guide:
>
>
> http://docs.cloudstack.apache.org/projects/cloudstack-release-notes/en/latest/rnotes.html#upgrade-from-4-2-x-to-4-3
>
>
> On Tue, Apr 29, 2014 at 2:53 PM, motty cruz  wrote:
>
>> For my VR, I created a new
>>
>>  "System Offering For Software Router"
>> CPU (in MHz): 1.00GHz
>> Memory (in MB): 1.00GB
>>
>> These are my current offerings. I'm sure more RAM and CPU would give
>> better performance.
>>
>> Thanks,
>>
>>
>>
>> On Tue, Apr 29, 2014 at 2:44 PM, stevenliang 
>> wrote:
>>
>> > Thank you again, motty.
>> > I didn't notice this earlier.
>> > BTW, how did you make your vr had 1GB CPU and 512MB RAM?
>> >
>> >
>> >
>> > On 29/04/14 05:33 PM, motty cruz wrote:
>> >
>> >> Stevellang,
>> >> I not sure if you saw this in the forums earlier :
>> >>
>> http://mail-archives.apache.org/mod_mbox/cloudstack-users/201404.mbox/%
>> >> 3CCALoOYy6A10bz1zOQQs1VyFb9epqLfhf7mu6hc=c2rfedroy...@mail.gmail.com
>> %3E
>> >>
>> >> I don't know if the bug was fixed yet,
>> >>
>> >> I will try upgrade in the next couple of days on a testing cluster,
>> will
>> >> report back if the bug was fixed.
>> >>
>> >> Thanks,
>> >>
>> >>
>> >> On Tue, Apr 29, 2014 at 2:25 PM, stevenliang 
>> >> wrote:
>> >>
>> >>  Thank you, motty.
>> >>> I am also running kvm. Since that time I failed upgrade, I am still
>> using
>> >>> 4.2.1. I'll try as your advice.
>> >>>
>> >>>
>> >>> On 29/04/14 05:19 PM, motty cruz wrote:
>> >>>
>> >>>  Stevenllang,
>> >>>>
>> >>>> I had the similar issue with VR, I notice it was because I leave the
>> >>>> default system specs on the VR, for instance by default 500MHz on CPU
>> >>>> and
>> >>>> 128MB on RAM, if you upgrade to at least 1GB on CPU and 512MB of RAM
>> >>>> your
>> >>>> VR will survive the upgrade from 4.2.1 to 4.3.1.
>> >>>>
>> >>>> I am running KVM, when I upgrade from 4.2.1 to 4.3 my VMs were not
>> able
>> >>>> to
>> >>>> access outside world, even if I created a new router.
>> >>>>
>> >>>> wish you the best,
>> >>>> -motty
>> >>>>
>> >>>>
>> >>>> On Tue, Apr 29, 2014 at 2:13 PM, stevenliang 
>> >>>> wrote:
>> >>>>
>> >>>>   Yes, I had two zones(one is basic, another is advanced mode).
>> >>>>
>> >>>>> After I upgraded from 4.2.1 to 4.3, the vrouter lost.
>> >>>>> So I rolled back to 4.2.1, the vrouter came back.
>> >>>>>
>> >>>>>
>> >>>>> On 29/04/14 04:54 PM, Ian Young wrote:
>> >>>>>
>> >>>>>   Did rolling back to 4.2 fix the problem?
>> >>>>>
>> >>>>>>
>> >>>>>> On Tue, Apr 29, 2014 at 1:22 PM, stevenliang <
>> stevenli...@yesup.com>
>> >>>>>> wrote:
>> >>>>>>
>> >>>>>>I met your situation before. Finally I rolled back to 4.2
>> >>>>>>
>> >>>>>>  On 29/04/14 04:18 PM, Ian Young wrote:
>> >>>>>>>
>> >>>>>>>I destroyed the old virtual router and was able to create a new
>> >>>>>>> one
>> >>>>>>> by
>> >>>>>>>
>> >>>>>>>  adding a new instance.  However, this new router also failed to
>> >>>>>>>> start,
>> >>>>>>>> citing the same error.  After that, the expungement

Re: failed to start virtual router

2014-04-29 Thread Ian Young
I think you can do that when you register the new templates in step 1 of
this guide:

http://docs.cloudstack.apache.org/projects/cloudstack-release-notes/en/latest/rnotes.html#upgrade-from-4-2-x-to-4-3


On Tue, Apr 29, 2014 at 2:53 PM, motty cruz  wrote:

> For my VR, I created a new
>
>  "System Offering For Software Router"
> CPU (in MHz): 1.00GHz
> Memory (in MB): 1.00GB
>
> These are my current offerings. I'm sure more RAM and CPU would give
> better performance.
>
> Thanks,
>
>
>
> On Tue, Apr 29, 2014 at 2:44 PM, stevenliang 
> wrote:
>
> > Thank you again, motty.
> > I didn't notice this earlier.
> > BTW, how did you make your vr had 1GB CPU and 512MB RAM?
> >
> >
> >
> > On 29/04/14 05:33 PM, motty cruz wrote:
> >
> >> Stevenliang,
> >> I'm not sure if you saw this in the forums earlier:
> >> http://mail-archives.apache.org/mod_mbox/cloudstack-users/201404.mbox/%3CCALoOYy6A10bz1zOQQs1VyFb9epqLfhf7mu6hc=c2rfedroy...@mail.gmail.com%3E
> >>
> >> I don't know if the bug has been fixed yet.
> >>
> >> I will try the upgrade in the next couple of days on a testing cluster and
> >> will report back on whether the bug was fixed.
> >>
> >> Thanks,
> >>
> >>
> >> On Tue, Apr 29, 2014 at 2:25 PM, stevenliang 
> >> wrote:
> >>
> >>> Thank you, motty.
> >>> I am also running KVM. Since my failed upgrade, I am still using
> >>> 4.2.1. I'll try what you advise.
> >>>
> >>>
> >>> On 29/04/14 05:19 PM, motty cruz wrote:
> >>>
> >>>> Stevenliang,
> >>>>
> >>>> I had a similar issue with the VR. I noticed it was because I had left
> >>>> the default system specs on the VR (by default, 500 MHz of CPU and
> >>>> 128 MB of RAM). If you upgrade to at least 1 GHz of CPU and 512 MB of
> >>>> RAM, your VR will survive the upgrade from 4.2.1 to 4.3.1.
> >>>>
> >>>> I am running KVM. When I upgraded from 4.2.1 to 4.3, my VMs were not
> >>>> able to access the outside world, even if I created a new router.
> >>>>
> >>>> Wish you the best,
> >>>> -motty
> >>>>
> >>>>
> >>>> On Tue, Apr 29, 2014 at 2:13 PM, stevenliang 
> >>>> wrote:
> >>>>
> >>>>> Yes, I had two zones (one basic, the other in advanced mode).
> >>>>> After I upgraded from 4.2.1 to 4.3, the vrouter was lost.
> >>>>> So I rolled back to 4.2.1, and the vrouter came back.
> >>>>>
> >>>>>
> >>>>> On 29/04/14 04:54 PM, Ian Young wrote:
> >>>>>
> >>>>>   Did rolling back to 4.2 fix the problem?
> >>>>>
> >>>>>>
> >>>>>> On Tue, Apr 29, 2014 at 1:22 PM, stevenliang  >
> >>>>>> wrote:
> >>>>>>
> >>>>>>I met your situation before. Finally I rolled back to 4.2
> >>>>>>
> >>>>>>  On 29/04/14 04:18 PM, Ian Young wrote:
> >>>>>>>
> >>>>>>>I destroyed the old virtual router and was able to create a new
> >>>>>>> one
> >>>>>>> by
> >>>>>>>
> >>>>>>>  adding a new instance.  However, this new router also failed to
> >>>>>>>> start,
> >>>>>>>> citing the same error.  After that, the expungement delay elapsed
> >>>>>>>> and
> >>>>>>>> the
> >>>>>>>> virtual router was expunged, so now I have none.
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> On Mon, Apr 28, 2014 at 8:52 PM, Ian Young <
> iyo...@ratespecial.com>
> >>>>>>>> wrote:
> >>>>>>>>
> >>>>>>>> I upgraded from 4.2.1 to 4.3.0 tonight, following the
> >>>>>>>> instructions
> >>>>>>>> here:
> >>>>>>>>
> >>>>>>>>   http://docs.cloudstack.apache.org/projects/cloudstack-
> >>>>>>>>
> >>>>>>>>> release-notes/en/latest/rnotes.html#upgrade-from-4-2-x-to-4-3
> >>>>>>>>>
> >>>>>>>>> At the last step, I tried to restart the system VMs.  The virtual
> >>>>>>>>> router
> >>>>>>>>> failed to start.  Here is the message that was displayed in the
> web
> >>>>>>>>> UI:
> >>>>>>>>>
> >>>>>>>>> Resource [Host:1] is unreachable: Host 1: Unable to start
> instance
> >>>>>>>>> due
> >>>>>>>>> to
> >>>>>>>>> Unable to start VM[DomainRouter|r-4-VM] due to error in
> >>>>>>>>> finalizeStart,
> >>>>>>>>> not
> >>>>>>>>> retrying
> >>>>>>>>>
> >>>>>>>>> I tried running the script to restart the VMs but this time it
> >>>>>>>>> failed
> >>>>>>>>> to
> >>>>>>>>> start the console proxy:
> >>>>>>>>>
> >>>>>>>>> [root@virthost1 ~]$ cloudstack-sysvmadm -d 192.168.100.6 -u
> cloud
> >>>>>>>>> -p
> >>>>>>>>> -a
> >>>>>>>>>
> >>>>>>>>> Stopping and starting 1 secondary storage vm(s)...
> >>>>>>>>> Done stopping and starting secondary storage vm(s)
> >>>>>>>>>
> >>>>>>>>> Stopping and starting 1 console proxy vm(s)...
> >>>>>>>>> ERROR: Failed to start console proxy vm with id 2
> >>>>>>>>>
> >>>>>>>>> Done stopping and starting console proxy vm(s) .
> >>>>>>>>>
> >>>>>>>>> Stopping and starting 0 running routing vm(s)...
> >>>>>>>>>
> >>>>>>>>> Is there a way to wipe the system VMs out and start over?
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>>
> >
>


Re: failed to start virtual router

2014-04-29 Thread Ian Young
Did rolling back to 4.2 fix the problem?


On Tue, Apr 29, 2014 at 1:22 PM, stevenliang  wrote:

> I ran into the same situation before. In the end, I rolled back to 4.2.
>
>
> On 29/04/14 04:18 PM, Ian Young wrote:
>
>> I destroyed the old virtual router and was able to create a new one by
>> adding a new instance.  However, this new router also failed to start,
>> citing the same error.  After that, the expungement delay elapsed and the
>> virtual router was expunged, so now I have none.
>>
>>
>> On Mon, Apr 28, 2014 at 8:52 PM, Ian Young 
>> wrote:
>>
>>  I upgraded from 4.2.1 to 4.3.0 tonight, following the instructions here:
>>>
>>>
>>> http://docs.cloudstack.apache.org/projects/cloudstack-
>>> release-notes/en/latest/rnotes.html#upgrade-from-4-2-x-to-4-3
>>>
>>> At the last step, I tried to restart the system VMs.  The virtual router
>>> failed to start.  Here is the message that was displayed in the web UI:
>>>
>>> Resource [Host:1] is unreachable: Host 1: Unable to start instance due to
>>> Unable to start VM[DomainRouter|r-4-VM] due to error in finalizeStart,
>>> not
>>> retrying
>>>
>>> I tried running the script to restart the VMs but this time it failed to
>>> start the console proxy:
>>>
>>> [root@virthost1 ~]$ cloudstack-sysvmadm -d 192.168.100.6 -u cloud -p -a
>>>
>>> Stopping and starting 1 secondary storage vm(s)...
>>> Done stopping and starting secondary storage vm(s)
>>>
>>> Stopping and starting 1 console proxy vm(s)...
>>> ERROR: Failed to start console proxy vm with id 2
>>>
>>> Done stopping and starting console proxy vm(s) .
>>>
>>> Stopping and starting 0 running routing vm(s)...
>>>
>>> Is there a way to wipe the system VMs out and start over?
>>>
>>>
>


Re: failed to start virtual router

2014-04-29 Thread Ian Young
I destroyed the old virtual router and was able to create a new one by
adding a new instance.  However, this new router also failed to start,
citing the same error.  After that, the expungement delay elapsed and the
virtual router was expunged, so now I have none.


On Mon, Apr 28, 2014 at 8:52 PM, Ian Young  wrote:

> I upgraded from 4.2.1 to 4.3.0 tonight, following the instructions here:
>
>
> http://docs.cloudstack.apache.org/projects/cloudstack-release-notes/en/latest/rnotes.html#upgrade-from-4-2-x-to-4-3
>
> At the last step, I tried to restart the system VMs.  The virtual router
> failed to start.  Here is the message that was displayed in the web UI:
>
> Resource [Host:1] is unreachable: Host 1: Unable to start instance due to
> Unable to start VM[DomainRouter|r-4-VM] due to error in finalizeStart, not
> retrying
>
> I tried running the script to restart the VMs but this time it failed to
> start the console proxy:
>
> [root@virthost1 ~]$ cloudstack-sysvmadm -d 192.168.100.6 -u cloud -p -a
>
> Stopping and starting 1 secondary storage vm(s)...
> Done stopping and starting secondary storage vm(s)
>
> Stopping and starting 1 console proxy vm(s)...
> ERROR: Failed to start console proxy vm with id 2
>
> Done stopping and starting console proxy vm(s) .
>
> Stopping and starting 0 running routing vm(s)...
>
> Is there a way to wipe the system VMs out and start over?
>


failed to start virtual router

2014-04-28 Thread Ian Young
I upgraded from 4.2.1 to 4.3.0 tonight, following the instructions here:

http://docs.cloudstack.apache.org/projects/cloudstack-release-notes/en/latest/rnotes.html#upgrade-from-4-2-x-to-4-3

At the last step, I tried to restart the system VMs.  The virtual router
failed to start.  Here is the message that was displayed in the web UI:

Resource [Host:1] is unreachable: Host 1: Unable to start instance due to
Unable to start VM[DomainRouter|r-4-VM] due to error in finalizeStart, not
retrying

I tried running the script to restart the VMs but this time it failed to
start the console proxy:

[root@virthost1 ~]$ cloudstack-sysvmadm -d 192.168.100.6 -u cloud -p -a

Stopping and starting 1 secondary storage vm(s)...
Done stopping and starting secondary storage vm(s)

Stopping and starting 1 console proxy vm(s)...
ERROR: Failed to start console proxy vm with id 2

Done stopping and starting console proxy vm(s) .

Stopping and starting 0 running routing vm(s)...

Is there a way to wipe the system VMs out and start over?
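
For anyone who wants to pick the failures out of a long cloudstack-sysvmadm
run, here is a minimal sketch in Python. It assumes the script's output has
been captured as text and that failures always follow the "ERROR: Failed to
start ... vm with id N" wording shown above; the function name
failed_system_vms is made up for this example and is not part of CloudStack.

```python
import re

# Pattern based only on the error line quoted in the message above.
# Other error wordings, if the script has any, would not be matched.
FAILED = re.compile(r"ERROR: Failed to start (.+?) vm with id (\d+)")


def failed_system_vms(output):
    """Return (vm_type, vm_id) pairs for every failure line in the output."""
    return [(m.group(1), int(m.group(2))) for m in FAILED.finditer(output)]
```

Feeding it the output pasted above should yield a single entry for the
console proxy with id 2, which is the VM to look at first.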


Re: SSVM Public IP not pinging

2014-02-23 Thread Ian Young
What about the SSVM's link local address? Can you ping that? It should
begin with 169.254.
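
If it helps to check an address programmatically, here is a small sketch
using Python's standard ipaddress module. The helper name is_link_local is
just for illustration, and the addresses in the usage note are examples,
not values taken from your SSVM.

```python
import ipaddress


def is_link_local(address):
    """Return True if the address is in the link-local range (169.254.0.0/16 for IPv4)."""
    return ipaddress.ip_address(address).is_link_local
```

For example, is_link_local("169.254.1.10") is true, while a LAN address
such as 192.168.100.6 is not link-local.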
On Feb 23, 2014 9:41 PM, "Tejas Gadaria"  wrote:

> I have created basic zone, with no security group on CS 4.0.2 and
> hypervisor is xenserver. SSVM and CPVM is running state but I am not able
> to ping SSVM Public IP. it says " Destination Host Unreachable" .  So I am
> not able to download CentOS template and and hence not able to create any
> guest vm.
>
> need help.
>
> Regards,
> Tejas
>


Re: Management node is detected inactive by timestamp but is pingable

2014-02-21 Thread Ian Young
I found this article, which showed me how to clear the never-ending
expungement:
http://support.citrix.com/article/CTX139482

After clearing it, I was able to launch a new instance with the same name.
However, the "inactive management node" notices continue to fill the logs.


On Fri, Feb 21, 2014 at 10:18 AM, Ian Young  wrote:

> All the records in the mshost table have a NULL value in the 'removed'
> column (and there are more of them now since I've restarted the service a
> few times).  These should have a timestamp instead, right?  If so, what
> should I do about this?
> http://pastebin.com/Vz3XnBqx
>
> My expunge.delay and expunge.interval values are set to 24 hours but this
> instance has been in that state for over 48 hours so far.  This is the
> record in vm_instance for the problematic instance:
> http://pastebin.com/kZVDy1My
>
> As you can see, the 'removed' value is NULL.  If I set it to a timestamp,
> will it go away or are there other references to it in the database?
> CloudStack won't let me create another instance with the same name until
> this one has been completely removed, and this particular name (web01) is a
> rather useful one for my purposes.
>
>
> On Fri, Feb 21, 2014 at 1:36 AM, Geoff Higginbottom <
> geoff.higginbot...@shapeblue.com> wrote:
>
>> Ian,
>>
>> When you delete a VM, it eventually gets expunged (deleted). The time
>> this actually takes is controlled by the global settings 'expunge.delay'
>> and 'expunge.interval'.
>>
>> Once the VM has been expunged, its state in the DB table vm_instance will
>> be 'expunging' and there will be a date/time in the 'removed' column (there
>> it is again!).
>>
>> Regards
>>
>> Geoff Higginbottom
>>
>> D: +44 20 3603 0542 | S: +44 20 3603 0540 | M: +447968161581
>>
>> geoff.higginbot...@shapeblue.com
>>
>> -Original Message-
>> From: Ian Young [mailto:iyo...@ratespecial.com]
>> Sent: 20 February 2014 19:37
>> To: users@cloudstack.apache.org
>> Subject: Re: Management node is detected inactive by timestamp but is
>> pingable
>>
>> Restarting cloudstack-agent and cloudstack-management made the inactive
>> management node notice go away but the instance is still stuck in an
>> "expunging" state.  How can I get rid of it?
>>
>>
>> On Thu, Feb 20, 2014 at 10:55 AM, Ian Young 
>> wrote:
>>
>> > I noticed that there are 12 records in the cloud.mshost table, all of
>> > which have an "Up" state.  I only have one management server.  Should
>> > I delete the other 11 records?
>> >
>> >
>> > On Thu, Feb 20, 2014 at 10:20 AM, Ian Young wrote:
>> >
>> >> Yesterday I tried to start an existing instance but it failed.  Since
>> >> it was basically a brand new installation, I just decided to destroy
>> >> it and start over.  However, it stayed in an "expunging" state and
>> >> remains so today.  I cannot create new instances now.  The
>> >> management-server.log shows numerous messages like this:
>> >>
>> >> 2014-02-20 10:04:04,442 DEBUG [cloud.cluster.ClusterManagerImpl]
>> >> (Cluster-Heartbeat-1:null) Detected management node left, id:11,
>> >> nodeIP:192.168.100.6
>> >> 2014-02-20 10:04:04,442 INFO  [cloud.cluster.ClusterManagerImpl]
>> >> (Cluster-Heartbeat-1:null) Trying to connect to 192.168.100.6
>> >> 2014-02-20 10:04:04,442 INFO  [cloud.cluster.ClusterManagerImpl]
>> >> (Cluster-Heartbeat-1:null) Management node 11 is detected inactive by
>> >> timestamp but is pingable
>> >>
>> >> The management server and the hypervisor host are the same machine
>> >> (budgetary constraints necessitated this setup) so, obviously, it
>> >> should be able to connect to itself.  What is this timestamp it's
>> >> referring to?  Is it simply a matter of updating this so the
>> >> management server is no longer considered inactive?
>> >>
>> >
>> >
>> Need Enterprise Grade Support for Apache CloudStack?
>> Our CloudStack Infrastructure Support<
>> http://shapeblue.com/cloudstack-infrastructure-support/> offers the best
>> 24/7 SLA for CloudStack Environments.
>>
>> Apache CloudStack Bootcamp training courses
>>
>> **NEW!** CloudStack 4.2.1 training<
>> http://shapeblue.com/cloudstack-training/>
>> 18th-19th February 2014, Brazil. Classroom<
>> http://shapeblue.com

Re: Management node is detected inactive by timestamp but is pingable

2014-02-21 Thread Ian Young
All the records in the mshost table have a NULL value in the 'removed'
column (and there are more of them now since I've restarted the service a
few times).  These should have a timestamp instead, right?  If so, what
should I do about this?
http://pastebin.com/Vz3XnBqx

My expunge.delay and expunge.interval values are set to 24 hours but this
instance has been in that state for over 48 hours so far.  This is the
record in vm_instance for the problematic instance:
http://pastebin.com/kZVDy1My

As you can see, the 'removed' value is NULL.  If I set it to a timestamp,
will it go away or are there other references to it in the database?
CloudStack won't let me create another instance with the same name until
this one has been completely removed, and this particular name (web01) is a
rather useful one for my purposes.
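
To make the timing argument concrete, here is a rough sketch of the
arithmetic in Python. It rests on assumptions that are mine, not
CloudStack's documented behaviour: that both settings are expressed in
seconds, and that the worst-case wait is the delay plus one full interval
of the cleanup thread. The function name expunge_overdue and the dates in
the usage note are hypothetical.

```python
from datetime import datetime, timedelta


def expunge_overdue(destroyed_at, now, delay_s=86400, interval_s=86400):
    """Return True if a destroyed VM has waited longer than delay + interval.

    Assumed model: the VM becomes eligible after delay_s seconds, and the
    cleanup thread may take up to interval_s more seconds to notice it.
    """
    deadline = destroyed_at + timedelta(seconds=delay_s + interval_s)
    return now > deadline
```

Under that model, a VM destroyed at datetime(2014, 2, 19, 10, 0) and still
present 49 hours later would count as overdue with both values at 24 hours,
which matches what I am seeing.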


On Fri, Feb 21, 2014 at 1:36 AM, Geoff Higginbottom <
geoff.higginbot...@shapeblue.com> wrote:

> Ian,
>
> When you delete a VM, it eventually gets expunged (deleted). The time this
> actually takes is controlled by the global settings 'expunge.delay' and
> 'expunge.interval'.
>
> Once the VM has been expunged, its state in the DB table vm_instance will
> be 'expunging' and there will be a date/time in the 'removed' column (there
> it is again!).
>
> Regards
>
> Geoff Higginbottom
>
> D: +44 20 3603 0542 | S: +44 20 3603 0540 | M: +447968161581
>
> geoff.higginbot...@shapeblue.com
>
> -Original Message-
> From: Ian Young [mailto:iyo...@ratespecial.com]
> Sent: 20 February 2014 19:37
> To: users@cloudstack.apache.org
> Subject: Re: Management node is detected inactive by timestamp but is
> pingable
>
> Restarting cloudstack-agent and cloudstack-management made the inactive
> management node notice go away but the instance is still stuck in an
> "expunging" state.  How can I get rid of it?
>
>
> On Thu, Feb 20, 2014 at 10:55 AM, Ian Young 
> wrote:
>
> > I noticed that there are 12 records in the cloud.mshost table, all of
> > which have an "Up" state.  I only have one management server.  Should
> > I delete the other 11 records?
> >
> >
> > On Thu, Feb 20, 2014 at 10:20 AM, Ian Young wrote:
> >
> >> Yesterday I tried to start an existing instance but it failed.  Since
> >> it was basically a brand new installation, I just decided to destroy
> >> it and start over.  However, it stayed in an "expunging" state and
> >> remains so today.  I cannot create new instances now.  The
> >> management-server.log shows numerous messages like this:
> >>
> >> 2014-02-20 10:04:04,442 DEBUG [cloud.cluster.ClusterManagerImpl]
> >> (Cluster-Heartbeat-1:null) Detected management node left, id:11,
> >> nodeIP:192.168.100.6
> >> 2014-02-20 10:04:04,442 INFO  [cloud.cluster.ClusterManagerImpl]
> >> (Cluster-Heartbeat-1:null) Trying to connect to 192.168.100.6
> >> 2014-02-20 10:04:04,442 INFO  [cloud.cluster.ClusterManagerImpl]
> >> (Cluster-Heartbeat-1:null) Management node 11 is detected inactive by
> >> timestamp but is pingable
> >>
> >> The management server and the hypervisor host are the same machine
> >> (budgetary constraints necessitated this setup) so, obviously, it
> >> should be able to connect to itself.  What is this timestamp it's
> >> referring to?  Is it simply a matter of updating this so the
> >> management server is no longer considered inactive?
> >>
> >
> >

Re: Management node is detected inactive by timestamp but is pingable

2014-02-20 Thread Ian Young
Restarting cloudstack-agent and cloudstack-management made the inactive
management node notice go away but the instance is still stuck in an
"expunging" state.  How can I get rid of it?


On Thu, Feb 20, 2014 at 10:55 AM, Ian Young  wrote:

> I noticed that there are 12 records in the cloud.mshost table, all of
> which have an "Up" state.  I only have one management server.  Should I
> delete the other 11 records?
>
>
> On Thu, Feb 20, 2014 at 10:20 AM, Ian Young wrote:
>
>> Yesterday I tried to start an existing instance but it failed.  Since it
>> was basically a brand new installation, I just decided to destroy it and
>> start over.  However, it stayed in an "expunging" state and remains so
>> today.  I cannot create new instances now.  The management-server.log shows
>> numerous messages like this:
>>
>> 2014-02-20 10:04:04,442 DEBUG [cloud.cluster.ClusterManagerImpl]
>> (Cluster-Heartbeat-1:null) Detected management node left, id:11,
>> nodeIP:192.168.100.6
>> 2014-02-20 10:04:04,442 INFO  [cloud.cluster.ClusterManagerImpl]
>> (Cluster-Heartbeat-1:null) Trying to connect to 192.168.100.6
>> 2014-02-20 10:04:04,442 INFO  [cloud.cluster.ClusterManagerImpl]
>> (Cluster-Heartbeat-1:null) Management node 11 is detected inactive by
>> timestamp but is pingable
>>
>> The management server and the hypervisor host are the same machine
>> (budgetary constraints necessitated this setup) so, obviously, it should be
>> able to connect to itself.  What is this timestamp it's referring to?  Is
>> it simply a matter of updating this so the management server is no longer
>> considered inactive?
>>
>
>


Re: Management node is detected inactive by timestamp but is pingable

2014-02-20 Thread Ian Young
I noticed that there are 12 records in the cloud.mshost table, all of which
have an "Up" state.  I only have one management server.  Should I delete
the other 11 records?


On Thu, Feb 20, 2014 at 10:20 AM, Ian Young  wrote:

> Yesterday I tried to start an existing instance but it failed.  Since it
> was basically a brand new installation, I just decided to destroy it and
> start over.  However, it stayed in an "expunging" state and remains so
> today.  I cannot create new instances now.  The management-server.log shows
> numerous messages like this:
>
> 2014-02-20 10:04:04,442 DEBUG [cloud.cluster.ClusterManagerImpl]
> (Cluster-Heartbeat-1:null) Detected management node left, id:11,
> nodeIP:192.168.100.6
> 2014-02-20 10:04:04,442 INFO  [cloud.cluster.ClusterManagerImpl]
> (Cluster-Heartbeat-1:null) Trying to connect to 192.168.100.6
> 2014-02-20 10:04:04,442 INFO  [cloud.cluster.ClusterManagerImpl]
> (Cluster-Heartbeat-1:null) Management node 11 is detected inactive by
> timestamp but is pingable
>
> The management server and the hypervisor host are the same machine
> (budgetary constraints necessitated this setup) so, obviously, it should be
> able to connect to itself.  What is this timestamp it's referring to?  Is
> it simply a matter of updating this so the management server is no longer
> considered inactive?
>


Management node is detected inactive by timestamp but is pingable

2014-02-20 Thread Ian Young
Yesterday I tried to start an existing instance but it failed.  Since it
was basically a brand new installation, I just decided to destroy it and
start over.  However, it stayed in an "expunging" state and remains so
today.  I cannot create new instances now.  The management-server.log shows
numerous messages like this:

2014-02-20 10:04:04,442 DEBUG [cloud.cluster.ClusterManagerImpl]
(Cluster-Heartbeat-1:null) Detected management node left, id:11,
nodeIP:192.168.100.6
2014-02-20 10:04:04,442 INFO  [cloud.cluster.ClusterManagerImpl]
(Cluster-Heartbeat-1:null) Trying to connect to 192.168.100.6
2014-02-20 10:04:04,442 INFO  [cloud.cluster.ClusterManagerImpl]
(Cluster-Heartbeat-1:null) Management node 11 is detected inactive by
timestamp but is pingable

The management server and the hypervisor host are the same machine
(budgetary constraints necessitated this setup) so, obviously, it should be
able to connect to itself.  What is this timestamp it's referring to?  Is
it simply a matter of updating this so the management server is no longer
considered inactive?
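
To see how often each node id shows up in these notices, something like the
following Python sketch could be run over the contents of
management-server.log. It only assumes the notice wording quoted above; the
function name count_inactive_notices is made up for this example.

```python
import re
from collections import Counter

# Wording taken from the log excerpt quoted above.
INACTIVE = re.compile(r"Management node (\d+) is detected inactive by timestamp")


def count_inactive_notices(log_text):
    """Return a Counter mapping management node id to number of notices."""
    # Collapse whitespace first so a notice wrapped across lines
    # (as in this email) is still matched.
    flat = " ".join(log_text.split())
    return Counter(int(m.group(1)) for m in INACTIVE.finditer(flat))
```

If more than one id turns up, that would suggest several stale management
node records rather than a single one.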


Re: Change of guest IP address

2014-02-19 Thread Ian Young
> > On 19-Dec-2013, at 3:58 PM, Andrei Mikhailovsky wrote:
> >
> > Do you know if there is an easier way? Like via the API calls or the
> > cloudmonkey command? Or is it currently the only way?
> >
> > - Original Message -
> > From: "Jayapal Reddy Uradi"
> > Sent: Thursday, 19 December, 2013 9:25:05 AM
> > Subject: Re: Change of guest IP address
> >
> > Hi,
> >
> > If your VM is in an isolated network, please do the following:
> >
> > 1. Edit the ip4_address column of the nics table for your instance_id,
> >    setting it to the new IP.
> > 2. Log in to the router that corresponds to the network and replace the
> >    old IP with the new IP in the files below:
> >    a. /var/lib/misc/dnsmasq.leases
> >    b. /etc/dhcphosts.txt
> > 3. Restart dnsmasq on the router (service dnsmasq restart).
> > 4. Reboot the VM or restart the network service in the VM so that the VM
> >    gets the new IP from DHCP.
> >
> > Thanks,
> > Jayapal

I put Jayapal's solution into a script for convenience:

http://pastebin.com/7yJtjNQX

Just edit the first group of variables according to your needs and run it 
like this:

set-vm-ip.sh old-address new-address
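
The pastebin script is not reproduced here. As a separate illustration of
step 2 only (swapping the old address for the new one in the router's
dnsmasq files), here is a minimal Python sketch. It works on the file
contents as a string and assumes nothing about the file layout beyond the
address appearing as a standalone token; the function name replace_ip and
the sample line in the usage note are hypothetical.

```python
import re


def replace_ip(text, old_ip, new_ip):
    """Replace old_ip with new_ip wherever it appears as a whole address.

    The lookarounds keep 10.1.1.5 from matching inside 10.1.1.50 or
    110.1.1.5, so neighbouring entries are left untouched.
    """
    pattern = re.compile(r"(?<![\d.])" + re.escape(old_ip) + r"(?![\d.])")
    # A function is used as the replacement so new_ip is inserted literally.
    return pattern.sub(lambda _match: new_ip, text)
```

Applied to a line such as "02:00:00:00:00:01,10.1.1.5,web01,infinite", it
changes only the address field. The database edit in step 1 and the dnsmasq
restart in step 3 still have to be done separately.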