[Yahoo-eng-team] [Bug 1633955] [NEW] live migrate rollback disconnect other's volume
Public bug reported:

I encountered this bug in my daily testing: when a volume fails to initialize its connection on the destination host, the live-migration rollback disconnects another instance's volume on that host.

My test steps are as follows:
1) Create two compute nodes (host#1 and host#2).
2) Create one VM (vm01) on host#1 with volume vol01.
3) Live-migrate vm01 from host#1 to host#2.
4) vol01 fails to initialize its connection on host#2.
5) The live migration rolls back and disconnects the volume on host#2.
6) A different volume on host#2 is disconnected by mistake.

The issue is that during rollback, nova disconnects the volume using the record from the block_device_mapping table, and that record is only updated for the destination host (host#2) once the volume connection has been initialized successfully. Because initialization failed on host#2, the record was never updated and still holds the connection info created on the source host (host#1). The records for the two hosts can differ, for example in the LUN ID mapped on the host, which is why a different volume gets disconnected on host#2 (see the sketch at the end of this message).

** Affects: nova
   Importance: Undecided
   Status: New

https://bugs.launchpad.net/bugs/1633955

Title:
  live migrate rollback disconnect other's volume

Status in OpenStack Compute (nova):
  New
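To make the failure mode concrete, here is a minimal, self-contained sketch (plain Python; HOST_LUN_MAP, disconnect_by_connection_info and safe_rollback are invented names for illustration, not nova's actual code):

    # A minimal, self-contained sketch (hypothetical names, not nova's code):
    # the saved connection info still describes the source host, so detaching
    # by its LUN on the destination host hits someone else's volume.
    HOST_LUN_MAP = {
        'host#1': {0: 'vol01'},   # vol01 really is mapped at LUN 0 on host#1
        'host#2': {0: 'vol99'},   # on host#2, LUN 0 belongs to another volume
    }

    def disconnect_by_connection_info(host, connection_info):
        """Detach whatever volume the saved LUN points at on this host."""
        return HOST_LUN_MAP[host].pop(connection_info['lun'], None)

    # connection_info recorded in block_device_mapping when vol01 was attached
    # on the *source* host; it was never refreshed for host#2.
    stale_info = {'host': 'host#1', 'lun': 0, 'volume_id': 'vol01'}

    # What the rollback effectively does today: the wrong volume is detached.
    assert disconnect_by_connection_info('host#2', stale_info) == 'vol99'

    # A safer rollback would first check that the record was actually created
    # for the destination host before disconnecting anything there.
    def safe_rollback(dest_host, connection_info):
        if connection_info.get('host') != dest_host:
            return None   # connection was never initialized on dest_host
        return disconnect_by_connection_info(dest_host, connection_info)

    assert safe_rollback('host#2', stale_info) is None

The point is simply that the LUN stored in the stale block_device_mapping record is only meaningful on the host where it was created.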
[Yahoo-eng-team] [Bug 1583419] Re: Make dict.keys() PY3 compatible
** Also affects: neutron
   Importance: Undecided
   Status: New

** Changed in: neutron
     Assignee: (unassigned) => Bin Zhou (binzhou)

https://bugs.launchpad.net/bugs/1583419

Title:
  Make dict.keys() PY3 compatible

Status in Cinder:
  In Progress
Status in neutron:
  New
Status in OpenStack Compute (nova):
  New
Status in python-cinderclient:
  Fix Released
Status in python-manilaclient:
  In Progress
Status in python-troveclient:
  In Progress
Status in Rally:
  New

Bug description:
  In PY3, dict.keys() returns a view of the keys, not a list, i.e.:

    $ python3.4
    Python 3.4.3 (default, Mar 31 2016, 20:42:37)
    >>> body = {"11": "22"}
    >>> body[body.keys()[0]]
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    TypeError: 'dict_keys' object does not support indexing

  So for PY3 compatibility it should be changed as follows:

    >>> body[list(body.keys())[0]]
    '22'
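For reference, a short self-contained snippet showing the portable patterns (plain Python; this is not taken from any of the affected projects' patches):

    # Works unchanged on Python 2 and Python 3.
    body = {"11": "22"}

    # For a single arbitrary key, next(iter(...)) avoids copying all keys:
    first_key = next(iter(body))
    assert body[first_key] == "22"

    # list() is the right fix when the code really needs indexing or slicing:
    keys = list(body.keys())
    assert body[keys[0]] == "22"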
[Yahoo-eng-team] [Bug 1472900] Re: instance boot from image(creates a new volume) deploy failed when volume rescheduling to other backends
Dear jichenjc: I've done it, thanks.

** No longer affects: nova

https://bugs.launchpad.net/bugs/1472900

Title:
  instance boot from image(creates a new volume) deploy failed when volume rescheduling to other backends

Status in Cinder:
  New

Bug description:
  This bug happens in the Icehouse and Kilo versions of OpenStack.

  I launched an instance from the web UI by booting from an image (which creates a new volume). The launch failed with "Invalid volume", yet cinder list showed that the volume had been rescheduled and created successfully.

  Reviewing the cinder volume code: when volume creation fails on one backend, the volume-create workflow reverts, which sets the volume status to "creating" and reschedules, and then sets the volume status to "error"; the volume is rescheduled to another backend and its status then becomes "downloading" and finally "available".

  While launching the instance, nova waits on the volume status in _await_block_device_map_created, which returns when the volume status is "available" or "error". When rescheduling happens, it returns while the volume is transiently in the "error" state, and check_attach then raises "Invalid volume" during the volume attach.

  The suggestion is that a volume being rescheduled should get a "rescheduling" status rather than being set to "error" in the workflow revert, which would make the volume status precise for other components.
[Yahoo-eng-team] [Bug 1472900] Re: instance boot from image(creates a new volume) deploy failed when volume rescheduling to other backends
** Also affects: cinder
   Importance: Undecided
   Status: New

https://bugs.launchpad.net/bugs/1472900

Title:
  instance boot from image(creates a new volume) deploy failed when volume rescheduling to other backends

Status in Cinder:
  New
Status in OpenStack Compute (nova):
  New

Bug description:
  This bug happens in the Icehouse and Kilo versions of OpenStack.

  I launched an instance from the web UI by booting from an image (which creates a new volume). The launch failed with "Invalid volume", yet cinder list showed that the volume had been rescheduled and created successfully.

  Reviewing the cinder volume code: when volume creation fails on one backend, the volume-create workflow reverts, which sets the volume status to "creating" and reschedules, and then sets the volume status to "error"; the volume is rescheduled to another backend and its status then becomes "downloading" and finally "available".

  While launching the instance, nova waits on the volume status in _await_block_device_map_created, which returns when the volume status is "available" or "error". When rescheduling happens, it returns while the volume is transiently in the "error" state, and check_attach then raises "Invalid volume" during the volume attach.

  The suggestion is that a volume being rescheduled should get a "rescheduling" status rather than being set to "error" in the workflow revert, which would make the volume status precise for other components.
[Yahoo-eng-team] [Bug 1472900] [NEW] instance boot from image(creates a new volume) deploy failed when volume rescheduling to other backends
Public bug reported:

This bug happens in the Icehouse and Kilo versions of OpenStack.

I launched an instance from the web UI by booting from an image (which creates a new volume). The launch failed with "Invalid volume", yet cinder list showed that the volume had been rescheduled and created successfully.

Reviewing the cinder volume code: when volume creation fails on one backend, the volume-create workflow reverts, which sets the volume status to "creating" and reschedules, and then sets the volume status to "error"; the volume is rescheduled to another backend and its status then becomes "downloading" and finally "available".

While launching the instance, nova waits on the volume status in _await_block_device_map_created, which returns when the volume status is "available" or "error". When rescheduling happens, it returns while the volume is transiently in the "error" state, and check_attach then raises "Invalid volume" during the volume attach (see the sketch at the end of this message).

The suggestion is that a volume being rescheduled should get a "rescheduling" status rather than being set to "error" in the workflow revert, which would make the volume status precise for other components.

** Affects: nova
   Importance: Undecided
   Status: New

https://bugs.launchpad.net/bugs/1472900

Title:
  instance boot from image(creates a new volume) deploy failed when volume rescheduling to other backends

Status in OpenStack Compute (nova):
  New
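To make the race concrete, here is a minimal, self-contained sketch (plain Python; wait_for_volume and the status sequences are invented for illustration and are not nova's or cinder's actual code):

    # Self-contained sketch (hypothetical; wait_for_volume stands in for
    # nova's _await_block_device_map_created, and the status lists are made
    # up to mimic the reschedule described above).
    def wait_for_volume(statuses, terminal=('available', 'error')):
        """Poll the volume status until it reaches a terminal value."""
        for status in statuses:
            if status in terminal:
                return status
        raise RuntimeError('volume never reached a terminal status')

    # Statuses nova may observe while cinder reverts and reschedules:
    reschedule = ['creating', 'error', 'creating', 'downloading', 'available']

    # Today the transient 'error' is treated as terminal, so nova gives up and
    # check_attach later raises "Invalid volume" even though the volume ends
    # up available on another backend.
    assert wait_for_volume(reschedule) == 'error'

    # With a dedicated 'rescheduling' status (the reporter's suggestion) the
    # transient state is unambiguous and the wait can continue.
    rescheduling = ['creating', 'rescheduling', 'creating', 'downloading', 'available']
    assert wait_for_volume(rescheduling) == 'available'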
[Yahoo-eng-team] [Bug 1415778] [NEW] _local_delete results inconsistent volume state in DB
Public bug reported:

When the nova-compute service is down, deleting an instance calls _local_delete in the nova-api service, which deletes the instance from the DB, terminates the volume connection, detaches the volume, and destroys the BDM. However, the connector passed to terminate_connection is set to connector = {'ip': '127.0.0.1', 'initiator': 'iqn.fake'}, which results in an exception, leaving the volume status "in-use" and the volume still attached to the instance, while the instance and BDM have already been deleted from the nova DB. This leaves the DB in an inconsistent state: the BDM is deleted in nova, but cinder still shows the volume in use.

Because the nova-compute service is down, we cannot get the correct connector for the host. If the connector were recorded in the BDM while attaching the volume, it could be read back from the BDM during _local_delete, and terminate_connection, detach_volume and so on would then succeed (see the sketch at the end of this message).

** Affects: nova
   Importance: Undecided
   Status: New

https://bugs.launchpad.net/bugs/1415778

Title:
  _local_delete results inconsistent volume state in DB

Status in OpenStack Compute (nova):
  New
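To illustrate the proposal, here is a minimal, self-contained sketch (plain Python; attach_volume, local_delete and bdm_table are invented names, not nova's actual code):

    # Hypothetical sketch (not nova's actual code) of the proposal: persist the
    # host connector in the BDM at attach time so _local_delete can terminate
    # the connection correctly even when nova-compute is unreachable.
    bdm_table = {}   # stand-in for the block_device_mapping table

    def attach_volume(instance_id, volume_id, host_connector):
        # Record the real connector used for the attachment, not just the
        # connection_info returned by cinder.
        bdm_table[(instance_id, volume_id)] = {'volume_id': volume_id,
                                               'connector': host_connector}

    def local_delete(instance_id, volume_id, terminate_connection):
        bdm = bdm_table.pop((instance_id, volume_id))
        # Use the recorded connector instead of the fake
        # {'ip': '127.0.0.1', 'initiator': 'iqn.fake'} placeholder.
        terminate_connection(volume_id, bdm['connector'])

    # Usage:
    attach_volume('vm01', 'vol01',
                  {'ip': '10.0.0.5', 'initiator': 'iqn.2015-01.example:host1'})
    local_delete('vm01', 'vol01',
                 lambda vol, conn: print('terminate', vol, 'via', conn['ip']))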