Reviewed:  https://review.openstack.org/254428
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=4f2a46987cf705d5dea84e97ef2006342cc5d9c4
Submitter: Jenkins
Branch:    master
commit 4f2a46987cf705d5dea84e97ef2006342cc5d9c4
Author: Matt Riedemann <mrie...@us.ibm.com>
Date:   Mon Dec 7 14:49:18 2015 -0800

    Make sure bdm.volume_id is set after auto-creating volumes

    The test_create_ebs_image_and_check_boot test in Tempest does the
    following:

    1. create volume1 from an image
    2. boot server1 from volume1 with delete_on_termination=True and
       wait for the server to be ACTIVE
    3. create snapshot from server1 (creates image and volume snapshots)
    4. delete server1
    5. create server2 from the image snapshot (don't wait for it to be
       ACTIVE - this auto-creates volume2 from the volume snapshot in
       cinder and attaches server2 to it)
    6. delete server2 (could still be building/attaching volumes in the
       background)
    7. cleanup

    There is a race when booting server2, which creates and attaches
    volume2, and deleting server2 before it is active. The volume attach
    completes and updates bdm.volume_id in the DB before we get to
    _shutdown_instance, but after the delete request is in the API. The
    compute API gets potentially stale BDMs and passes those over RPC to
    the compute. So we add a check in _shutdown_instance to see if we
    have potentially stale volume BDMs and refresh that list if so.

    The instance.uuid locks in build_and_run_instance and
    terminate_instance create the mutex on the compute host such that
    bdm.volume_id should be set in the database after the volume attach
    and before terminate_instance gets the lock. The bdm.volume_id could
    still be None in _shutdown_instance if the volume create fails, but
    we don't have anything to tear down in cinder in that case anyway.

    In the case of the race bug, deleting the volume snapshot in cinder
    fails because volume2 was never deleted by nova, so the test fails
    in teardown.

    Note that there is still potential for a race here; this change does
    not eliminate it, but it should narrow the race window.

    This also cleans up the logging in attach_block_devices since there
    may not be a volume_id at that point (depending on bdm.source_type).
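The stale-BDM refresh described in the commit message can be sketched roughly as below. This is a simplified, hypothetical illustration, not the actual Nova code: the BlockDeviceMapping class, the _DB dict, and get_bdms_by_instance_uuid are stand-ins for Nova's objects and DB layer, and shutdown_instance only models the refresh check itself.

```python
# Hypothetical sketch of the stale-BDM refresh check; names here mimic
# Nova-style objects but are stand-ins, not Nova's real implementation.

class BlockDeviceMapping:
    def __init__(self, bdm_id, source_type, destination_type, volume_id=None):
        self.id = bdm_id
        self.source_type = source_type            # e.g. 'snapshot', 'image'
        self.destination_type = destination_type  # e.g. 'volume'
        self.volume_id = volume_id                # set once the attach completes

    def is_volume(self):
        return self.destination_type == 'volume'


# Simulated DB state: by the time terminate_instance holds the
# instance.uuid lock, the async attach has finished and volume_id
# has been persisted.
_DB = {
    'inst-1': [BlockDeviceMapping(1, 'snapshot', 'volume', volume_id='vol-2')],
}


def get_bdms_by_instance_uuid(instance_uuid):
    """Stand-in for re-reading the BDM list from the database."""
    return _DB[instance_uuid]


def shutdown_instance(instance_uuid, bdms):
    """Refresh the BDM list if any volume BDM lacks a volume_id.

    The compute API may have read the BDMs before the volume attach
    updated volume_id, so the list passed over RPC can be stale.
    """
    if any(bdm.is_volume() and bdm.volume_id is None for bdm in bdms):
        # Potentially stale: re-read the BDMs from the DB.
        bdms = get_bdms_by_instance_uuid(instance_uuid)
    # ... proceed to detach/delete volumes using the (refreshed) bdms ...
    return bdms
```

With a stale list (volume_id still None), the refresh picks up the persisted volume_id; a list that already has volume_id set is passed through untouched. If the volume create itself failed, volume_id stays None even after the refresh, matching the "nothing to tear down in cinder" case in the commit message.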
    Closes-Bug: #1464259
    Change-Id: Ib60d60a5af35be89ad8afbcf44fcffe0b0ce2876

** Changed in: nova
       Status: In Progress => Fix Released

--
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1464259

Title:
  Volumes tests fails often with rbd backend

Status in Cinder:
  Triaged
Status in OpenStack Compute (nova):
  Fix Released
Status in tempest:
  Invalid

Bug description:
  http://logs.openstack.org/02/173802/5/check/check-tempest-dsvm-full-ceph/a72aac1/logs/screen-n-api.txt.gz?level=TRACE#_2015-06-11_09_04_19_511

  2015-06-11 09:04:19.511 ERROR nova.api.ec2 [req-0ac81d78-2717-4dd2-80e2-d94363b55ac8 EC2VolumesTest-442487008 EC2VolumesTest-1066393631] Unexpected InvalidInput raised: Invalid input received: Invalid volume: Volume still has 1 dependent snapshots. (HTTP 400) (Request-ID: req-4586b5d2-7212-4ddd-af79-43ad8ba7ea58)

  2015-06-11 09:04:19.511 ERROR nova.api.ec2 [req-0ac81d78-2717-4dd2-80e2-d94363b55ac8 EC2VolumesTest-442487008 EC2VolumesTest-1066393631] Environment: {"HTTP_AUTHORIZATION": "AWS4-HMAC-SHA256 Credential=a5e9253350ce4a249ddce8b7c1c798c2/20150611/0/127/aws4_request,SignedHeaders=host;x-amz-date,Signature=304830ed947f7fba3143887b08d1e47faa18d4b59782c0992727cb7593f586b4", "SCRIPT_NAME": "", "REQUEST_METHOD": "POST", "HTTP_X_AMZ_DATE": "20150611T090418Z", "PATH_INFO": "/", "SERVER_PROTOCOL": "HTTP/1.0", "CONTENT_LENGTH": "60", "HTTP_USER_AGENT": "Boto/2.38.0 Python/2.7.6 Linux/3.13.0-53-generic", "RAW_PATH_INFO": "/", "REMOTE_ADDR": "127.0.0.1", "wsgi.url_scheme": "http", "SERVER_PORT": "8773", "CONTENT_TYPE": "application/x-www-form-urlencoded; charset=UTF-8", "HTTP_HOST": "127.0.0.1:8773", "SERVER_NAME": "127.0.0.1", "GATEWAY_INTERFACE": "CGI/1.1", "REMOTE_PORT": "45819", "HTTP_ACCEPT_ENCODING": "identity"}
  http://logstash.openstack.org/#eyJzZWFyY2giOiJtZXNzYWdlOlwiRUMyVm9sdW1lc1Rlc3RcIiBBTkQgbWVzc2FnZTpcIlVuZXhwZWN0ZWQgSW52YWxpZElucHV0IHJhaXNlZDogSW52YWxpZCBpbnB1dCByZWNlaXZlZDogSW52YWxpZCB2b2x1bWU6IFZvbHVtZSBzdGlsbCBoYXMgMSBkZXBlbmRlbnQgc25hcHNob3RzXCIgQU5EIHRhZ3M6XCJzY3JlZW4tbi1hcGkudHh0XCIiLCJmaWVsZHMiOltdLCJvZmZzZXQiOjAsInRpbWVmcmFtZSI6IjYwNDgwMCIsImdyYXBobW9kZSI6ImNvdW50IiwidGltZSI6eyJ1c2VyX2ludGVydmFsIjowfSwic3RhbXAiOjE0MzQwMzAyMTUwODd9

  10 hits in 7 days, check and gate, hitting on the ceph and glusterfs jobs.

To manage notifications about this bug go to:
https://bugs.launchpad.net/cinder/+bug/1464259/+subscriptions

--
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to     : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp