Hello,
I picked up this bug from Mandar and have implemented and unit-tested a fix. The following are the changes to the “revert_memory_snapshot” function of the “vmopsSnapshot” script:

    import logging
    import subprocess
    # xs_errors and the echo decorator are assumed to come from the existing script.

    @echo
    def revert_memory_snapshot(session, args):
        retCode = 0
        logging.debug("Calling revert_memory_snapshot with " + str(args))
        vmName = args['vmName']
        snapshotUUID = args['snapshotUUID']
        oldVmUuid = args['oldVmUuid']
        snapshotMemory = args['snapshotMemory']
        hostUUID = args['hostUUID']
        # Default to an empty list so the cleanup loop is safe if vbd-list fails.
        vdiUuids = []
        try:
            # Collect the VDIs attached to the old VM so they can be removed after the revert.
            cmd = '''xe vbd-list vm-uuid=%s | grep 'vdi-uuid' | grep -v 'not in database' | sed -e 's/vdi-uuid ( RO)://g' ''' % oldVmUuid
            logging.debug("Executing command: " + cmd)
            p = subprocess.Popen(cmd, stdout=subprocess.PIPE, stderr=subprocess.PIPE, shell=True)
            stdout, stderr = p.communicate()
            retCode = p.returncode
            if retCode == 0:
                vdiUuids = stdout.split()
                logging.debug(vdiUuids)
            else:
                logging.error("Command: " + cmd + " failed with return code: " + str(retCode) + " and error message: " + stderr)

            # Shut the old VM down unless it is already halted.
            cmd = '''xe vm-param-get param-name=power-state uuid=%s ''' % oldVmUuid
            logging.debug("Executing command: " + cmd)
            p = subprocess.Popen(cmd, stdout=subprocess.PIPE, stderr=subprocess.PIPE, shell=True)
            stdout, stderr = p.communicate()
            retCode = p.returncode
            if retCode == 0:
                vmPowerState = stdout.split()[0]
                logging.debug("vmPowerState: " + vmPowerState)
                if vmPowerState != 'halted':
                    cmd1 = '''xe vm-shutdown force=true vm=%s ''' % vmName
                    logging.debug("Executing command: " + cmd1)
                    retCode1 = subprocess.call(cmd1, shell=True)
                    if retCode1 != 0:
                        logging.error("Command: " + cmd1 + " failed with return code: " + str(retCode1))
            else:
                logging.error("Command: " + cmd + " failed with return code: " + str(retCode) + " and error message: " + stderr)

            cmd = '''xe vm-destroy uuid=%s ''' % oldVmUuid
            logging.debug("Executing command: " + cmd)
            retCode = subprocess.call(cmd, shell=True)
            if retCode != 0:
                logging.error("Command: " + cmd + " failed with return code: " + str(retCode))

            # The revert itself; its return code is what the function reports.
            cmd = '''xe snapshot-revert snapshot-uuid=%s ''' % snapshotUUID
            logging.debug("Executing command: " + cmd)
            p = subprocess.Popen(cmd, stdout=subprocess.PIPE, stderr=subprocess.PIPE, shell=True)
            stdout, stderr = p.communicate()
            retCode = p.returncode
            if retCode == 0:
                logging.debug("Command: " + cmd + " executed successfully, stdout = " + stdout)
                if snapshotMemory == 'true':
                    # Single "=" here; the posted version had "vm==%s".
                    cmd1 = '''xe vm-resume vm=%s on=%s ''' % (vmName, hostUUID)
                    logging.debug("Executing command: " + cmd1)
                    retCode1 = subprocess.call(cmd1, shell=True)
                    if retCode1 != 0:
                        logging.error("Command: " + cmd1 + " failed with return code: " + str(retCode1))
                for vdiUuid in vdiUuids:
                    # These UUIDs are VDIs, so vdi-destroy is used rather than vm-destroy.
                    cmd2 = '''xe vdi-destroy uuid=%s ''' % vdiUuid
                    logging.debug("Executing command: " + cmd2)
                    retCode2 = subprocess.call(cmd2, shell=True)
                    if retCode2 != 0:
                        logging.error("Command: " + cmd2 + " failed with return code: " + str(retCode2))
            else:
                logging.error("Command: " + cmd + " failed with return code: " + str(retCode) + " and error message: " + stderr)
        except OSError, (errno, strerror):
            errMsg = "OSError while reverting vm " + vmName + " to snapshot " + snapshotUUID + " with errno: " + str(errno) + " and strerr: " + strerror
            logging.error(errMsg)
            raise xs_errors.XenError(errMsg)
        return str(retCode)

Logs:

    2014-10-31 16:57:37 DEBUG [root] Executing command: xe vbd-list vm-uuid=91dcb34d-1b96-909f-71c2-9ea3ed0911ec | grep 'vdi-uuid' | grep -v 'not in database' | sed -e 's/vdi-uuid ( RO)://g'
    2014-10-31 16:57:37 DEBUG [root] ['0043e816-d74d-4e2c-8078-0e92f093454c', 'aff8dc5b-8db3-4822-ba83-d2357f0c1cad']
    2014-10-31 16:57:37 DEBUG [root] Executing command: xe vm-param-get param-name=power-state uuid=91dcb34d-1b96-909f-71c2-9ea3ed0911ec
    2014-10-31 16:57:37 DEBUG [root] vmPowerState: halted
    2014-10-31 16:57:37 DEBUG [root] Executing command: xe vm-destroy uuid=91dcb34d-1b96-909f-71c2-9ea3ed0911ec
    2014-10-31 16:57:37 DEBUG [root] Executing command: xe snapshot-revert snapshot-uuid=ab6ac37f-3a50-e1e7-23ce-4f19cd929c2c
    2014-10-31 16:57:39 ERROR [root] Command: xe snapshot-revert snapshot-uuid=ab6ac37f-3a50-e1e7-23ce-4f19cd929c2c failed with return code: 1 and error message: Error code: SR_BACKEND_FAILURE_44

I have changed all the calls that used “os” to use “subprocess” instead. Please have a look and let me know if any changes are required.

Thanks in advance!

Regards,
Krunal Jain | Senior Engineer – Cloud (Product Engineering)
Email: krunal.j...@sungard.com ▪ kruna...@gmail.com | Mobile: +91-92713-59024
Sungard Availability Services, India | www.sungardas.in <http://www.sungardas.com/>
2nd Floor, Wing 4, Cluster D, MIDC, Kharadi Knowledge Park, Pune - 411 014

From: Mandar Barve [mailto:mandar.ba...@sungardas.com]
Sent: Monday, October 27, 2014 12:12 PM
To: Krunal Jain
Subject: Fwd: CLOUDSTACK-5583: vmopsSnapshot plug-in (XenServer) does not return an error when it should

---------- Forwarded message ----------
From: Mike Tutkowski <mike.tutkow...@solidfire.com>
Date: Sun, Jul 27, 2014 at 9:48 AM
Subject: Re: CLOUDSTACK-5583: vmopsSnapshot plug-in (XenServer) does not return an error when it should
To: Mandar Barve <mandar.ba...@sungardas.com>
Cc: cloudstack <dev@cloudstack.apache.org>

Sorry...I somehow missed this e-mail.

Yes, I think trying that approach sounds good.

Thanks!

On Mon, Jul 7, 2014 at 8:11 AM, Mandar Barve <mandar.ba...@sungardas.com> wrote:

I followed the steps you mentioned on the bug using a software iSCSI target in a VM and could reproduce the problem. I do see an INSUFFICIENT_SPACE exception being thrown when "xe snapshot-revert" is called by the vmopsSnapshot plug-in. When this happens, the plug-in doesn't throw any exception or return any error.
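A minimal sketch of how such a silent failure can arise and be detected, assuming a POSIX shell. FAILING_CMD is a hypothetical stand-in for the failing "xe snapshot-revert" call, and run_checked and run_or_raise are illustrative helper names, not part of the vmopsSnapshot plug-in:

```python
import os
import subprocess

# Hypothetical stand-in for a failing "xe snapshot-revert" (assumes a POSIX shell).
FAILING_CMD = "echo SR_BACKEND_FAILURE_44 1>&2; exit 1"


def run_checked(cmd):
    """Run cmd through the shell and return (exit code, stderr text)."""
    p = subprocess.Popen(cmd, stdout=subprocess.PIPE,
                         stderr=subprocess.PIPE, shell=True)
    _, stderr = p.communicate()
    return p.returncode, stderr.decode("utf-8", "replace").strip()


def run_or_raise(cmd):
    """Raise RuntimeError when cmd exits with a non-zero status."""
    code, err = run_checked(cmd)
    if code != 0:
        raise RuntimeError("%s failed with return code %d: %s" % (cmd, code, err))
    return code


# os.system raises nothing on failure; the failure is visible only in the
# status it returns, which is lost if the caller never looks at it.
status = os.system(FAILING_CMD)

# Popen.communicate exposes both the exit code and the error text.
code, err = run_checked(FAILING_CMD)
```

Whether to raise, as run_or_raise does, or to return the code as a string to the management server is a design choice; the sketch only shows that both pieces of information are available to the caller.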
The problem appears to be that the xe command is called via os.system by the Python plug-in. xe is a separate program, and any error or exception it throws won't get propagated to the caller. To fix this, os.system can be replaced by subprocess.call with a check for the return code. I tried this, and it returns a non-zero error code to the management server. It may still not return the child process's exception code. Let me know what you think.

Thanks,
Mandar

On Fri, Mar 14, 2014 at 11:55 AM, Mandar Barve <mandar.ba...@sungard.com> wrote:

I tried to reproduce the issue the way you mentioned, with a few changes. I don't have an iSCSI SAN on my setup. I connected a 2 GB disk that I presented to CS as storage-tagged NFS primary storage. I created a 1 GB disk offering and then deployed a VM on this new primary storage. I took a couple of snapshots like you mentioned, and when I tried to revert to one of them I did see an error in the vmops log that said revert_memory_snapshot returned NULL. An exception was thrown with async job status result code 530 and the text response "Failed to revert VM snapshot". I think this exception came later. The vmops snapshot plug-in code itself may not have reached the exception-handling path. I need to double-check this.

Is this what you are referring to? Could you attach snippets of SMlog and vmops.log from when the failure happened to the JIRA?

Thanks,
Mandar

On Tue, Mar 11, 2014 at 3:25 AM, Mike Tutkowski <mike.tutkow...@solidfire.com> wrote:

Here is the comment I just added in JIRA for this ticket.

Thanks!

Hi,

Here is how I reproduced it:

I created an iSCSI volume on my SAN that is only 2 GB.
I created a XenServer SR based on this SAN volume.
I created Primary Storage in CloudStack based on this XenServer SR.
I created a Disk Offering that was storage tagged to use this Primary Storage. It will lead to the creation of a 1 GB volume when executed and attached to a VM for the first time.
I executed the Disk Offering to create a CloudStack volume and attached this volume to a VM.
I took two hypervisor snapshots of the VM, then reverted to the first hypervisor snapshot.
I looked at the SR that should contain my CloudStack volume and its hypervisor snapshots. I saw two snapshots, but no active VDI. I should see two hypervisor snapshots and an active VDI.

Thanks!

On Mon, Mar 10, 2014 at 9:27 AM, Mike Tutkowski <mike.tutkow...@solidfire.com> wrote:

I did look at it, but haven't had a chance to try to repro. I should be able to try to repro it today.

Thanks!

On Sun, Mar 9, 2014 at 10:05 PM, Mandar Barve <mandar.ba...@sungard.com> wrote:

Hi Mike,

Did you get a chance to look at this?

Thanks,
Mandar

On Wed, Mar 5, 2014 at 10:12 AM, Mandar Barve <mandar.ba...@sungard.com> wrote:

I tested this with CS 4.3.

Thanks,
Mandar

On Tue, Mar 4, 2014 at 9:09 PM, Mike Tutkowski <mike.tutkow...@solidfire.com> wrote:

Hi,

Can you tell me what release you tested this with? I noticed the problem while developing on CloudStack 4.3.

Thanks!

On Tue, Mar 4, 2014 at 3:43 AM, Mandar Barve <mandar.ba...@sungard.com> wrote:

Hi,

I tried to reproduce the issue but couldn't get this to fail for insufficient space. I then injected an exception by trying to list files from a non-existent path (I added this code in the "try" block). This landed me in the exception-handling code. It raised the correct exception, saying "file not found", which was captured in the management server vmops log file. It was not displayed by the GUI. The GUI just reported Error (are we looking for the GUI to display the error code?).

The plug-in code returns "0" immediately after the line of code that raises the exception, but I think this applies only to successful execution of the plug-in code that reverts the snapshot. If any exception is raised (e.g. insufficient space, as in the case reported here), then the code should return an appropriate error message to the caller, as I found. In the exception-handling path, return "0" wouldn't execute.
I don't see any problem here. Let me know if I am missing anything.

Thanks,
Mandar

-- 
Mike Tutkowski
Senior CloudStack Developer, SolidFire Inc.
e: mike.tutkow...@solidfire.com
o: 303.746.7302
Advancing the way the world uses the cloud <http://solidfire.com/solution/overview/?video=play>™
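The two paths discussed in the earliest message above (an injected Python exception versus a failing child command) can be sketched as follows. This is an illustration under stated assumptions, not the plug-in's code: revert is a hypothetical stand-in for revert_memory_snapshot, "exit 1" stands in for a failing xe call, and the modern "except ... as" form is used instead of the Python 2-only syntax in the plug-in.

```python
import os
import tempfile


def revert(step):
    """Hypothetical stand-in for revert_memory_snapshot; step is any callable."""
    try:
        step()
    except OSError as e:
        # Reached only when Python itself raises, e.g. listing a missing path.
        raise RuntimeError("OSError while reverting: %s" % e)
    return "0"  # reached whenever step() returns, whatever the child's exit status


# A path that is guaranteed not to exist inside a fresh temporary directory.
missing = os.path.join(tempfile.mkdtemp(), "no-such-dir")

# Case 1: an injected Python-level error takes the except path, so "0" is never returned.
try:
    revert(lambda: os.listdir(missing))
    outcome = "returned"
except RuntimeError:
    outcome = "raised"

# Case 2: a failing child process raises nothing in Python, so "0" comes back.
result = revert(lambda: os.system("exit 1"))
```

Case 1 corresponds to the injected "file not found" test; case 2 corresponds to a child command failing without the caller checking its exit status.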