Hey Adam,

There have been some updates since Liberty to improve handling in the os-brick library, which handles the local device management. But with this showing the paths down, I wonder if there's something else going on between the NetApp box and the Nova compute host.
Could you file a bug to track this? I think you could just copy and paste the content of your original email, since it captures a lot of great info:

https://bugs.launchpad.net/cinder/+filebug

We can tag it with netapp so maybe it will get some attention there.

Thanks,
Sean

On Wed, Aug 23, 2017 at 01:01:24PM -0400, Adam Dibiase wrote:
> Greetings,
>
> I am having an issue with nova starting an instance that is using a root
> volume that cinder has extended. More specifically, a volume that has been
> extended past the max resize limit of our NetApp filer. I am running
> Liberty and upgraded the cinder packages from 7.0.0 to 7.0.3 to take
> advantage of this functionality. From what I can gather, it uses sub-LUN
> cloning to get past the hard limit set by NetApp when cloning past 64G
> (starting from a 4G volume).
>
> *Environment:*
>
>    - Release: Liberty
>    - Filer: NetApp
>    - Protocol: Fibre Channel
>    - Multipath: yes
>
> *Steps to reproduce:*
>
>    - Create a new instance
>    - Stop the instance
>    - Extend the volume by running the following commands:
>      - cinder reset-state --state available (volume-ID or name)
>      - cinder extend (volume-ID or name) 100
>      - cinder reset-state --state in-use (volume-ID or name)
>    - Start the instance with either nova start or nova reboot --hard --
>      same result
>
> I can see that the instance's multipath status is good before the resize:
> 360a98000417643556a2b496d58665473 dm-17 NETAPP ,LUN
> size=20G features='1 queue_if_no_path' hwhandler='0' wp=rw
> |-+- policy='round-robin 0' prio=-1 status=active
> | |- 6:0:1:5 sdy  65:128 active undef running
> | `- 7:0:0:5 sdz  65:144 active undef running
> `-+- policy='round-robin 0' prio=-1 status=enabled
>   |- 6:0:0:5 sdx  65:112 active undef running
>   `- 7:0:1:5 sdaa 65:160 active undef running
>
> Once the volume is resized, the LUN goes to a failed state and it does not
> show the new size:
>
> 360a98000417643556a2b496d58665473 dm-17 NETAPP ,LUN
> size=20G features='1 queue_if_no_path' hwhandler='0' wp=rw
> |-+- policy='round-robin 0' prio=-1 status=enabled
> | |- 6:0:1:5 sdy  65:128 failed undef running
> | `- 7:0:0:5 sdz  65:144 failed undef running
> `-+- policy='round-robin 0' prio=-1 status=enabled
>   |- 6:0:0:5 sdx  65:112 failed undef running
>   `- 7:0:1:5 sdaa 65:160 failed undef running
>
> Like I said, this only happens on volumes that have been extended past
> 64G. Smaller sizes do not have this issue. I can only assume that the
> original LUN is getting destroyed after the clone process and that is the
> cause of the failed state. Why is it not picking up the new one and
> attaching it to the compute node? Is there something I am missing?
>
> Thanks in advance,
>
> Adam
>
> _______________________________________________
> OpenStack-operators mailing list
> OpenStack-operators@lists.openstack.org
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators
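For anyone wanting to reproduce this, the extend sequence quoted above boils down to the following commands. This is only a sketch: "myinstance", "myvolume", and the 100 GB target are placeholder names, and RUN defaults to echo so the commands are printed rather than executed against a real cloud.

```shell
# Sketch of the extend-past-64G reproduction from the thread above.
# "myinstance", "myvolume", and the 100 GB size are placeholders;
# RUN defaults to echo, making this a dry run.
RUN="${RUN:-echo}"
INSTANCE="myinstance"
VOLUME="myvolume"

$RUN nova stop "$INSTANCE"

# Cinder won't extend an in-use volume, hence the reset-state
# dance around the extend call.
$RUN cinder reset-state --state available "$VOLUME"
$RUN cinder extend "$VOLUME" 100
$RUN cinder reset-state --state in-use "$VOLUME"

# Starting (or hard-rebooting) the instance is where the failed
# multipath state shows up once the volume has crossed 64G.
$RUN nova start "$INSTANCE"
```

Per the report, the failure only appears when the new size crosses the NetApp 64G sub-LUN clone threshold; the same sequence with a smaller target size completes cleanly.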