I have fixed it here: https://review.openstack.org/#/c/11925/
2012/8/25 Sam Su <susltd...@gmail.com>:
> Hi,
>
> I also reported this bug:
> https://bugs.launchpad.net/nova/+bug/1040255
>
> If someone can combine you guys' solutions into a complete fix for this
> bug, that would be great.
>
> BRs,
> Sam
>
>
> On Thu, Aug 23, 2012 at 9:27 PM, heut2008 <heut2...@gmail.com> wrote:
>>
>> This bug has been filed here: https://bugs.launchpad.net/nova/+bug/1040537
>>
>> 2012/8/24 Vishvananda Ishaya <vishvana...@gmail.com>:
>> > +1 to this. Evan, can you report a bug (if one hasn't been reported
>> > yet) and propose the fix? Or else I can find someone else to propose it.
>> >
>> > Vish
>> >
>> > On Aug 23, 2012, at 1:38 PM, Evan Callicoat <diop...@gmail.com> wrote:
>> >
>> > Hello all!
>> >
>> > I'm the original author of the hairpin patch, and things have changed a
>> > little bit in Essex and Folsom from the original Diablo target. I
>> > believe I can shed some light on what should be done here to solve the
>> > issue in either case.
>> >
>> > ---
>> > For Essex (stable/essex), in nova/virt/libvirt/connection.py:
>> > ---
>> >
>> > Currently _enable_hairpin() is only called from spawn(). However,
>> > spawn() is not the only place that vifs (veth#) get added to a bridge
>> > (which is when we need to enable hairpin_mode on them). The more
>> > relevant function is _create_new_domain(), which is called from spawn()
>> > and other places. Without changing the information that gets passed to
>> > _create_new_domain() (which is just 'xml' from to_xml()), we can easily
>> > rewrite the first two lines of _enable_hairpin() as follows:
>> >
>> >     def _enable_hairpin(self, xml):
>> >         interfaces = self.get_interfaces(xml['name'])
>> >
>> > Then we can move the self._enable_hairpin(instance) call from spawn()
>> > up into _create_new_domain(), and pass it xml as follows:
>> >
>> >     [...]
>> >     self._enable_hairpin(xml)
>> >     return domain
>> >
>> > This will run the hairpin code every time a domain gets created, which
>> > is also when the domain's vif(s) get inserted into the bridge with the
>> > default of hairpin_mode=0.
>> >
>> > ---
>> > For Folsom (trunk), in nova/virt/libvirt/driver.py:
>> > ---
>> >
>> > There have been a lot more changes made here, but the same strategy as
>> > above should work. Here, _create_new_domain() has been split into
>> > _create_domain() and _create_domain_and_network(), and
>> > _enable_hairpin() was moved from spawn() to
>> > _create_domain_and_network(), which seems like the right thing to do,
>> > but doesn't quite cover all of the cases of vif reinsertion, since
>> > _create_domain() is the only function which actually creates the domain
>> > (_create_domain_and_network() just calls it after doing some pre-work).
>> > The solution here is likewise fairly simple: make the same two changes
>> > to _enable_hairpin():
>> >
>> >     def _enable_hairpin(self, xml):
>> >         interfaces = self.get_interfaces(xml['name'])
>> >
>> > and move the call from _create_domain_and_network() to
>> > _create_domain(), like before:
>> >
>> >     [...]
>> >     self._enable_hairpin(xml)
>> >     return domain
>> >
>> > I haven't yet tested this on my Essex clusters and I don't have a
>> > Folsom cluster handy at present, but the change is simple and makes
>> > sense. Looking at to_xml() and _prepare_xml_info(), it appears that the
>> > 'xml' variable _create_[new_]domain() gets is just a Python dictionary,
>> > and xml['name'] = instance['name'], exactly what _enable_hairpin() was
>> > using the 'instance' variable for previously.
>> >
>> > Let me know if this works, or doesn't work, or doesn't make sense, or
>> > if you need an address to send gifts, etc. Hope it's solved!
>> >
>> > -Evan
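To make Evan's two edits concrete, here is a minimal sketch of the reworked
Essex methods (the same shape applies to Folsom's driver.py). It assumes, as
Evan describes, that the value reaching _create_new_domain() carries the
instance name under the 'name' key; the class name, the stub helpers, and the
sysfs path are illustrative assumptions, not the actual nova source:

    # Sketch of Evan's proposal; class name, helper stubs, and the sysfs
    # path are assumptions drawn from the thread, not the nova source.

    class HairpinSketch(object):
        """Stand-in for the hairpin-related slice of LibvirtConnection."""

        def get_interfaces(self, instance_name):
            # Stub: the real method returns the domain's vif names.
            return []

        def _define_and_launch(self, xml):
            # Stub: the real code defines and starts the libvirt domain.
            raise NotImplementedError

        def _enable_hairpin(self, xml):
            # Edit 1: accept 'xml' instead of 'instance'; per Evan,
            # xml['name'] == instance['name'], so the lookup is unchanged.
            for interface in self.get_interfaces(xml['name']):
                # Turn hairpinning on for each vif, mirroring the sysfs
                # workaround quoted later in the thread.
                path = '/sys/class/net/%s/brport/hairpin_mode' % interface
                with open(path, 'w') as f:
                    f.write('1')

        def _create_new_domain(self, xml):
            domain = self._define_and_launch(xml)
            # Edit 2: enable hairpin on every domain creation, not only
            # in spawn(), since this is when the vif re-enters the bridge
            # with the default hairpin_mode=0.
            self._enable_hairpin(xml)
            return domain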
>> > On Thu, Aug 23, 2012 at 11:20 AM, Sam Su <susltd...@gmail.com> wrote:
>> >>
>> >> Hi Oleg,
>> >>
>> >> Thank you for your investigation. Good luck!
>> >>
>> >> Can you let me know if you find out how to fix the bug?
>> >>
>> >> Thanks,
>> >> Sam
>> >>
>> >> On Wed, Aug 22, 2012 at 12:50 PM, Oleg Gelbukh <ogelb...@mirantis.com>
>> >> wrote:
>> >>>
>> >>> Hello,
>> >>>
>> >>> Is it possible that, during snapshotting, libvirt just tears down the
>> >>> virtual interface at some point and then re-creates it, with
>> >>> hairpin_mode disabled again?
>> >>> This bugfix [https://bugs.launchpad.net/nova/+bug/933640] implies
>> >>> that the fix works on spawn of an instance. This means that upon
>> >>> resume after snapshot, hairpin is not restored. Maybe if we insert
>> >>> the _enable_hairpin() call in the snapshot procedure, it will help.
>> >>> We're currently investigating this issue in one of our environments;
>> >>> we hope to come up with an answer by tomorrow.
>> >>>
>> >>> --
>> >>> Best regards,
>> >>> Oleg
>> >>>
>> >>> On Wed, Aug 22, 2012 at 11:29 PM, Sam Su <susltd...@gmail.com> wrote:
>> >>>>
>> >>>> My friend has found a way to let the VM ping itself when this
>> >>>> problem happens, but not why it happens:
>> >>>>
>> >>>>     sudo echo "1" >
>> >>>>     /sys/class/net/br1000/brif/<virtual-interface-name>/hairpin_mode
>> >>>>
>> >>>> I filed a ticket to report this problem:
>> >>>> https://bugs.launchpad.net/nova/+bug/1040255
>> >>>>
>> >>>> Hopefully someone can find out why this happens and solve it.
>> >>>>
>> >>>> Thanks,
>> >>>> Sam
>> >>>>
>> >>>> On Fri, Jul 20, 2012 at 3:50 PM, Gabriel Hurley
>> >>>> <gabriel.hur...@nebula.com> wrote:
>> >>>>>
>> >>>>> I ran into some similar issues with the _enable_hairpin() call. The
>> >>>>> call is allowed to fail silently and (in my case) was failing. I
>> >>>>> couldn't for the life of me figure out why, though, and since I'm
>> >>>>> really not a networking person I didn't trace it along too far.
>> >>>>>
>> >>>>> Just thought I'd share my similar pain.
>> >>>>>
>> >>>>> - Gabriel
>> >>>>>
>> >>>>> From: Sam Su
>> >>>>> Sent: Thursday, July 19, 2012 11:50 AM
>> >>>>> To: Brian Haley
>> >>>>> Cc: openstack
>> >>>>> Subject: Re: [Openstack] VM can't ping self floating IP after a
>> >>>>> snapshot is taken
>> >>>>>
>> >>>>> Thank you for your support.
>> >>>>>
>> >>>>> I checked the file nova/virt/libvirt/connection.py; the call
>> >>>>> self._enable_hairpin(instance) is already added to the function
>> >>>>> _hard_reboot().
>> >>>>>
>> >>>>> It looks like there are some differences between taking a snapshot
>> >>>>> and rebooting an instance. I tried to figure out how to fix this
>> >>>>> bug but failed.
>> >>>>>
>> >>>>> It will be much appreciated if anyone can give some hints.
>> >>>>>
>> >>>>> Thanks,
>> >>>>>
>> >>>>> Sam
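Sam's one-liner above fixes a single vif; a small script in the same spirit
can re-enable hairpin_mode on every port of the bridge after it has been
reset. This is a workaround sketch, not nova code: 'br1000' is the bridge
name from Sam's example, and it must run as root.

    import os

    BRIDGE = 'br1000'  # bridge name from Sam's example; adjust as needed
    brif_dir = '/sys/class/net/%s/brif' % BRIDGE

    # One entry per vif currently attached to the bridge.
    for port in os.listdir(brif_dir):
        hairpin = os.path.join(brif_dir, port, 'hairpin_mode')
        with open(hairpin, 'w') as f:
            f.write('1')  # 1 = let traffic hairpin back out the same port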
>> >>>>> On Thu, Jul 19, 2012 at 8:37 AM, Brian Haley <brian.ha...@hp.com>
>> >>>>> wrote:
>> >>>>>
>> >>>>> On 07/17/2012 05:56 PM, Sam Su wrote:
>> >>>>> > Hi,
>> >>>>> >
>> >>>>> > This always happens in the Essex release. After I take a snapshot
>> >>>>> > of my VM (I tried Ubuntu 12.04 and CentOS 5.8), the VM can't ping
>> >>>>> > its own floating IP; before I take a snapshot, though, the VM can
>> >>>>> > ping its own floating IP.
>> >>>>> >
>> >>>>> > This looks closely related to
>> >>>>> > https://bugs.launchpad.net/nova/+bug/933640, but still a little
>> >>>>> > different. In 933640, it sounds like the VM can't ping its own
>> >>>>> > floating IP regardless of whether we take a snapshot or not.
>> >>>>> >
>> >>>>> > Any suggestion for an easy fix? And what is the root cause of the
>> >>>>> > problem?
>> >>>>>
>> >>>>> It might be because there's a missing _enable_hairpin() call in the
>> >>>>> reboot() function. Try something like this...
>> >>>>>
>> >>>>> nova/virt/libvirt/connection.py, _hard_reboot():
>> >>>>>
>> >>>>>          self._create_new_domain(xml)
>> >>>>> +        self._enable_hairpin(instance)
>> >>>>>          self.firewall_driver.apply_instance_filter(instance,
>> >>>>>                                                     network_info)
>> >>>>>
>> >>>>> At least that's what I remember doing myself recently when testing
>> >>>>> after a reboot; I don't know about snapshot.
>> >>>>>
>> >>>>> Folsom has changed enough that something different would need to be
>> >>>>> done there.
>> >>>>>
>> >>>>> -Brian
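For anyone testing Brian's change (or any of the fixes above), a quick way
to confirm the result is to read hairpin_mode back for every bridge port on
the compute node after a snapshot or reboot. A small sketch, assuming the
same sysfs layout as the workaround above:

    import glob
    import os

    # Print hairpin_mode for every port of every Linux bridge on the host.
    for path in glob.glob('/sys/class/net/*/brif/*/hairpin_mode'):
        parts = path.split(os.sep)
        bridge, vif = parts[4], parts[6]
        with open(path) as f:
            print('%s/%s hairpin_mode=%s' % (bridge, vif, f.read().strip()))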