Sorry for the late response!

Ack from my side, this seems good to have in 4.1 and will resolve some weird issues.

Wido

On 03/15/2013 04:31 AM, Marcus Sorensen wrote:
Just wanted to ping on this again. I'm confident enough in it if widow
doesn't get a chance to respond.
On Mar 13, 2013 1:27 PM, "Marcus Sorensen" <shadow...@gmail.com> wrote:

Ok, the fix for CLOUDSTACK-600 is the first patch mentioned plus two
fixes to it. The first additional patch handles upgrades, and the
second one is a minor fix for when we shut down a non-persistent VM
that will avoid printing useless scary messages (we throw an error in
the timeout loop that waits for the vm to shut down when it succeeds,
because the domain can't be found).

I've tested this patch set both in a fresh 4.1 install and a fresh
master install, and both were able to start/stop VMs successfully.
When upgrading an existing 4.1 install (without the patches), the VMs
were undefined on shutdown as they should be.

commit 5dfcd309f10e5bd6a918f7fdff3f44a3dff2374a
Author: Wido den Hollander <w...@widodh.nl>
Date:   Thu Feb 7 22:58:20 2013 +0100

     agent: Do not define domains persistent in libvirt

     We used to define domains persistent in libvirt, which caused XML
definitionsto stay there after a reboot of the hypervisor.

     We however don't do anything with those already defined domains,
actually, we wipe all defined domains when starting the agent.

     Some users however reported that libvirt started these domains
after a reboot before the CloudStack agent was started.

     By starting domains from the XML description and not defining them
we prevent them from ever being stored in libvirt.

commit 8d7d1cd5623a6744a954fb35eeff8408d6381083
Author: Marcus Sorensen <mar...@betterservers.com>
Date:   Wed Mar 13 11:10:56 2013 -0600

     Summary: KVM - undefine persistent VMs on stop

     Detail: A previous patch fixed an issue where we are defining VMs to
persist
     locally on KVM hosts, which can cause issues if the agent isn't
running and
     libvirt decides to start the VM unbeknownst to cloudstack. The
previous patch
     stopped defining VMs as persistent. This patch adds compatibility
for existing
     cloudstack environments, removing the persistent definition on
stop if needed.

     BUG-ID: CLOUDSTACK-600

commit 97d2e3fe7772fa01941295397f9d59d35cf47671
Author: Marcus Sorensen <mar...@betterservers.com>
Date:   Wed Mar 13 12:57:46 2013 -0600

     Summary: KVM - remove harmless message about domain not found on VM
stop

     Detail: When we stop a VM, it's definition is no longer valid.
Therefore, we
     need to catch the exception thrown from libvirt in trying to lookup a
     non-existent domain by UUID while trying to check if it's shut down.

     BUG-ID:CLOUDSTACK-600
     Signed-off-by: Marcus Sorensen <mar...@betterservers.com> 1363201066
-0600


On Tue, Mar 12, 2013 at 11:49 PM, Marcus Sorensen <shadow...@gmail.com>
wrote:
Looks like this patch might need to be adjusted slightly.  We can use
the domain.isPersistent() method to determine if we need to call
domain.undefine() in addition to dm.shutdown and dm.destroy. That will
allow upgrades to clean up existing definitions as VMs are shutdown.
If you don't have a chance to get to that, I'll try to tomorrow.

On Tue, Mar 12, 2013 at 7:23 PM, Marcus Sorensen <shadow...@gmail.com>
wrote:
To clarify, this is a fairly serious bug, and it should probably be
addressed, I'm just wanting clarification that the patch is good to go
as a fix. On my way home I was thinking about how I'm not sure if it
addresses upgrades, i.e. does it still allow existing definitions to
be wiped, or will those be forever stuck. I haven't even looked yet.

Essentially, the bug is that if a KVM host is uncleanly shut down, and
then brought back up without starting the agent, the VMs that were
running on the system will be started by libvirt/startup scripts,
which can be disastrous for shared-storage VMs.  So if a host goes
down without being put into maintenance mode, for whatever reason, and
the agent is either slow to start, the agent fails to start for some
reason, or auto-start is disabled for some reason, you're almost
guaranteed to have two VMs running off of the same shared storage
image.

We don't want to make any VMs permanently known to a host, and libvirt
provides a simple way to define a VM domain without making it
permanent.

On Tue, Mar 12, 2013 at 6:42 PM, Chip Childers
<chip.child...@sungard.com> wrote:
On Tue, Mar 12, 2013 at 06:39:09PM -0600, Marcus Sorensen wrote:
This addresses CLOUDSTACK-600.  Please get buyoff from Wido if
possible, since it's his patch. I've reviewed and tested it, but I
want to make sure I'm not wrong in assuming that this should be in 4.1

commit 5dfcd309f10e5bd6a918f7fdff3f44a3dff2374a
Author: Wido den Hollander <w...@widodh.nl>
Date:   Thu Feb 7 22:58:20 2013 +0100

     agent: Do not define domains persistent in libvirt

     We used to define domains persistent in libvirt, which caused XML
definitions
     to stay there after a reboot of the hypervisor.

     We however don't do anything with those already defined domains,
actually, we wipe
     all defined domains when starting the agent.

     Some users however reported that libvirt started these domains
after a reboot
     before the CloudStack agent was started.

     By starting domains from the XML description and not defining them
we prevent
     them from ever being stored in libvirt.


Wido - I'll let you comment before we apply this.



Reply via email to