Actually it's that your patch is for the 'forced' stop. Still doesn't apply to 4.1, but the code I'm looking at for the moment is the non-forced stop, which returns a boolean.
On Thu, Apr 25, 2013 at 10:05 PM, Marcus Sorensen <[email protected]>wrote: > Uh, nevermind, of course I can't return null there because it requires a > boolean in 4.1. I'll play with this a bit and get back with you. > > > On Thu, Apr 25, 2013 at 10:01 PM, Marcus Sorensen <[email protected]>wrote: > >> Thanks, Edison. That doesn't apply to 4.1, but it's basically the same as >> the first hunk of my patch above, just that it returns null instead of >> throwing an exception. I'm going to test that now. I didn't see why that >> CloudRuntimeException in the first hunk would be squelched, but the finally >> clause makes sense in the second hunk. >> >> >> On Thu, Apr 25, 2013 at 6:08 PM, Edison Su <[email protected]> wrote: >> >>> This code will do the magic: >>> } finally { >>> if (!stopped) { >>> if (!forced) { >>> s_logger.warn("Unable to stop vm " + vm); >>> try { >>> stateTransitTo(vm, Event.OperationFailed, >>> vm.getHostId()); >>> } catch (NoTransitionException e) { >>> s_logger.warn("Unable to transition the state " >>> + vm); >>> } >>> return false; >>> } else { >>> s_logger.warn("Unable to actually stop " + vm + " >>> but continue with release because it's a force stop"); >>> vmGuru.finalizeStop(profile, answer); >>> } >>> } >>> } >>> >>> >>> Basically, it means, if stop failed and it's not force stop, then mark >>> the vm as running. >>> I think the logic here sounds correct, as cloudstack can't delete the >>> vm(due to the connection between mgt server and kvm agent is broken), then >>> all the operations on the VM should fail, and the VM status shouldn't be >>> changed. >>> >>> But the problem is, the stop api call should fail, as stop vm actually >>> failed. >>> Could you try the following patch: >>> >>> diff --git a/server/src/com/cloud/vm/UserVmManagerImpl.java >>> b/server/src/com/cloud/vm/UserVmManagerImpl.java >>> index 7bf04ec..ff20c54 100755 >>> --- a/server/src/com/cloud/vm/UserVmManagerImpl.java >>> +++ b/server/src/com/cloud/vm/UserVmManagerImpl.java >>> @@ -3684,10 +3684,15 @@ public class UserVmManagerImpl extends >>> ManagerBase implements UserVmManager, Use >>> } >>> >>> UserVO user = _userDao.findById(userId); >>> - >>> + boolean status = false; >>> try { >>> VirtualMachineEntity vmEntity = >>> _orchSrvc.getVirtualMachine(vm.getUuid()); >>> - vmEntity.stop(new Long(userId).toString()); >>> + status = vmEntity.stop(new Long(userId).toString()); >>> + if (status) { >>> + return _vmDao.findById(vmId); >>> + } else { >>> + return null; >>> + } >>> } catch (ResourceUnavailableException e) { >>> throw new CloudRuntimeException( >>> "Unable to contact the agent to stop the virtual >>> machine " >>> @@ -3698,7 +3703,7 @@ public class UserVmManagerImpl extends ManagerBase >>> implements UserVmManager, Use >>> + vm, e); >>> } >>> >>> - return _vmDao.findById(vmId); >>> + >>> } >>> >>> @Override >>> >>> >>> > -----Original Message----- >>> > From: Marcus Sorensen [mailto:[email protected]] >>> > Sent: Thursday, April 25, 2013 4:46 PM >>> > To: [email protected] >>> > Subject: Re: [ACS41] new critical bug >>> > >>> > I tried a few things, including throwing a CloudRuntimeException in >>> several >>> > places where I thought it made sense, such as an empty >>> > AgentUnavailableException catch block, but it doesn't seem to do >>> anything at >>> > all, I don't see it in the logs, even though I do see the debug code >>> that I >>> > placed next to it. So I give up for now :-) >>> > >>> > diff --git a/server/src/com/cloud/vm/UserVmManagerImpl.java >>> > b/server/src/com/cloud/vm/UserVmManagerImpl.java >>> > index 7bf04ec..0373690 100755 >>> > --- a/server/src/com/cloud/vm/UserVmManagerImpl.java >>> > +++ b/server/src/com/cloud/vm/UserVmManagerImpl.java >>> > @@ -685,7 +685,8 @@ public class UserVmManagerImpl extends >>> > ManagerBase implements UserVmManager, Use >>> > if (status) { >>> > return status; >>> > } else { >>> > - return status; >>> > + throw new CloudRuntimeException( >>> > + "Unable to stop the virtual machine" + vm ); >>> > } >>> > } >>> > >>> > diff --git a/server/src/com/cloud/vm/VirtualMachineManagerImpl.java >>> > b/server/src/com/cloud/vm/VirtualMachineManagerImpl. >>> > index 2c2986f..e320ff1 100755 >>> > --- a/server/src/com/cloud/vm/VirtualMachineManagerImpl.java >>> > +++ b/server/src/com/cloud/vm/VirtualMachineManagerImpl.java >>> > @@ -1083,7 +1083,11 @@ public class VirtualMachineManagerImpl extends >>> > ManagerBase implements VirtualMac >>> > vmGuru.finalizeStop(profile, answer); >>> > >>> > } catch (AgentUnavailableException e) { >>> > + s_logger.error("Unable to stop vm, agent unavailable: " + >>> > e.toString()); >>> > + throw new CloudRuntimeException("Unable to stop vm, agent >>> > unavailable"); >>> > } catch (OperationTimedoutException e) { >>> > + s_logger.error("Unable to stop vm, operation timed out: " >>> + >>> > e.toString()); >>> > + throw new CloudRuntimeException("Unable to stop vm, >>> > + operation >>> > timed out"); >>> > } finally { >>> > if (!stopped) { >>> > if (!forced) { >>> > >>> > >>> > On Thu, Apr 25, 2013 at 2:06 PM, Chip Childers >>> > <[email protected]>wrote: >>> > >>> > > On Thu, Apr 25, 2013 at 02:01:56PM -0600, Marcus Sorensen wrote: >>> > > > I didn't mark this one as a blocker, but it's still pretty bad for >>> > > someone >>> > > > managing VMs in an automated fashion. Trying to stop a VM when the >>> > > > KVM agent is disconnected reports success. >>> > > > >>> > > > https://issues.apache.org/jira/browse/CLOUDSTACK-2195 >>> > > >>> > > I'll pull a fix in, if we have one ready before the final blockers. >>> > > Otherwise I'd pull it into a 4.1.1 release. >>> > > >>> >> >> >
