There is also another bug that could be linked to or marked as a duplicate of #1192381: https://bugs.launchpad.net/neutron/+bug/1185916. I proposed a fix, but it was not the right approach, so I abandoned it.
Édouard.

On Wed, Dec 4, 2013 at 10:43 PM, Carl Baldwin <c...@ecbaldwin.net> wrote:
> I have offered up https://review.openstack.org/#/c/60082/ as a
> backport to Havana. Interest was expressed in the blueprint for doing
> this even before this thread. If there is consensus for this as the
> stop-gap then it is there for the merging. However, I do not want to
> discourage discussion of other stop-gap solutions like what Maru
> proposed in the original post.
>
> Carl
>
> On Wed, Dec 4, 2013 at 9:12 AM, Ashok Kumaran <ashokkumara...@gmail.com> wrote:
>>
>> On Wed, Dec 4, 2013 at 8:30 PM, Maru Newby <ma...@redhat.com> wrote:
>>>
>>> On Dec 4, 2013, at 8:55 AM, Carl Baldwin <c...@ecbaldwin.net> wrote:
>>>
>>> > Stephen, all,
>>> >
>>> > I agree that there may be some opportunity to split things out a bit.
>>> > However, I'm not sure what the best way will be. I recall that Mark
>>> > mentioned breaking out the processes that handle API requests and RPC
>>> > from each other at the summit. Anyway, it is something that has been
>>> > discussed.
>>> >
>>> > I actually wanted to point out that the neutron server now has the
>>> > ability to run a configurable number of sub-processes to handle a
>>> > heavier load. Introduced with this commit:
>>> >
>>> > https://review.openstack.org/#/c/37131/
>>> >
>>> > Set api_workers to something > 1 and restart the server.
>>> >
>>> > The server can also be run on more than one physical host in
>>> > combination with multiple child processes.
>>>
>>> I completely misunderstood the import of the commit in question. Being
>>> able to run the wsgi server(s) out of process is a nice improvement, thank
>>> you for making it happen. Has there been any discussion around making the
>>> default for api_workers > 0 (at least 1) to ensure that the default
>>> configuration separates wsgi and rpc load?
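[For readers following along: the api_workers option Carl refers to is set in neutron.conf on the server. A minimal illustrative fragment, with the value chosen for example only — defaults and tuning differ between releases:]

```ini
# neutron.conf -- illustrative fragment, not a complete configuration
[DEFAULT]
# Number of child processes forked to serve API (WSGI) requests.
# 0 serves API requests in the main process, alongside RPC handling;
# values > 0 move API handling into separate worker processes.
api_workers = 4
```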
>>> This also seems like a great candidate for backporting to havana and
>>> maybe even grizzly, although api_workers should probably be defaulted
>>> to 0 in those cases.
>>
>> +1 for backporting the api_workers feature to havana as well as Grizzly :)
>>>
>>> FYI, I re-ran the test that attempted to boot 75 micro VM's simultaneously
>>> with api_workers = 2, with mixed results. The increased wsgi throughput
>>> resulted in almost half of the boot requests failing with 500 errors due to
>>> QueuePool errors (https://bugs.launchpad.net/neutron/+bug/1160442) in
>>> Neutron. It also appears that maximizing the number of wsgi requests has
>>> the side-effect of increasing the RPC load on the main process, and this
>>> means that the problem of dhcp notifications being dropped is little
>>> improved. I intend to submit a fix that ensures that notifications are
>>> sent regardless of agent status, in any case.
>>>
>>> m.
>>>
>>> > Carl
>>> >
>>> > On Tue, Dec 3, 2013 at 9:47 AM, Stephen Gran
>>> > <stephen.g...@theguardian.com> wrote:
>>> >> On 03/12/13 16:08, Maru Newby wrote:
>>> >>>
>>> >>> I've been investigating a bug that is preventing VM's from receiving
>>> >>> IP addresses when a Neutron service is under high load:
>>> >>>
>>> >>> https://bugs.launchpad.net/neutron/+bug/1192381
>>> >>>
>>> >>> High load causes the DHCP agent's status updates to be delayed,
>>> >>> causing the Neutron service to assume that the agent is down. This
>>> >>> results in the Neutron service not sending notifications of port
>>> >>> addition to the DHCP agent. At present, the notifications are simply
>>> >>> dropped. A simple fix is to send notifications regardless of agent
>>> >>> status. Does anybody have any objections to this stop-gap approach?
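[An aside for readers unfamiliar with the QueuePool errors Maru mentions: SQLAlchemy's QueuePool hands out a fixed number of database connections, and a checkout that arrives once the pool is exhausted times out and raises an error, which surfaces as a 500 to the API caller. The toy stand-in below uses only the standard library — the class and connection names are invented for illustration, not Neutron or SQLAlchemy code — to show the failure mode: more concurrent workers means more simultaneous checkouts against the same bounded pool.]

```python
import queue

class ToyConnectionPool:
    """Toy stand-in for SQLAlchemy's QueuePool: a bounded pool whose
    checkout fails once every connection is already handed out."""

    def __init__(self, size, timeout=0.1):
        self._pool = queue.Queue(maxsize=size)
        self._timeout = timeout
        for i in range(size):
            self._pool.put(f"conn-{i}")  # stand-ins for real DB connections

    def checkout(self):
        try:
            return self._pool.get(timeout=self._timeout)
        except queue.Empty:
            # SQLAlchemy raises a similar "QueuePool limit reached" timeout.
            raise TimeoutError("pool exhausted: all connections checked out")

    def checkin(self, conn):
        self._pool.put(conn)

# More concurrent API workers -> more simultaneous checkouts -> exhaustion.
pool = ToyConnectionPool(size=2)
held = [pool.checkout(), pool.checkout()]  # both connections now in use
try:
    pool.checkout()
except TimeoutError as exc:
    print(exc)  # pool exhausted: all connections checked out
```

[The usual mitigations are raising the pool size/overflow limits or shortening the time each request holds a connection; the stop-gap discussed in this thread does neither, which is why the boot failures persisted.]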
>>> >>> I'm not clear on the implications of sending notifications to agents
>>> >>> that are down, but I'm hoping for a simple fix that can be backported
>>> >>> to both havana and grizzly (yes, this bug has been with us that long).
>>> >>>
>>> >>> Fixing this problem for real, though, will likely be more involved.
>>> >>> The proposal to replace the current wsgi framework with Pecan may
>>> >>> increase the Neutron service's scalability, but should we continue to
>>> >>> use a 'fire and forget' approach to notification? Being able to track
>>> >>> the success or failure of a given action outside of the logs would
>>> >>> seem pretty important, and allow for more effective coordination with
>>> >>> Nova than is currently possible.
>>> >>
>>> >> It strikes me that we ask an awful lot of a single neutron-server
>>> >> instance - it has to take state updates from all the agents, it has
>>> >> to do scheduling, it has to respond to API requests, and it has to
>>> >> communicate about actual changes with the agents.
>>> >>
>>> >> Maybe breaking some of these out the way nova has a scheduler and a
>>> >> conductor and so on might be a good model (I know there are things
>>> >> people are unhappy about with nova-scheduler, but imagine how much
>>> >> worse it would be if it was built into the API).
>>> >>
>>> >> Doing all of those tasks, and doing it largely single threaded, is
>>> >> just asking for overload.
>>> >>
>>> >> Cheers,
>>> >> --
>>> >> Stephen Gran
>>> >> Senior Systems Integrator - theguardian.com
>>> >> Please consider the environment before printing this email.
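[The stop-gap Maru describes amounts to removing the agent-liveness check from the notification path. A hypothetical sketch — every function and field name below is invented for illustration, not Neutron's actual notifier API — of the dropped-notification bug and the proposed behaviour:]

```python
# Hypothetical sketch of the stop-gap; names are illustrative only.
def notify_port_created(agent, cast, port, check_liveness=False):
    """Send a port-created notification to a DHCP agent.

    With check_liveness=True (the buggy behaviour), the notification is
    silently dropped when the agent's delayed heartbeats make it look
    down. The stop-gap sends it regardless, since a busy-but-alive agent
    will still process the message once it catches up on its queue.
    """
    if check_liveness and not agent["alive"]:
        return False  # notification silently dropped -- the bug
    cast("port_create_end", {"port": port}, topic=agent["topic"])
    return True

sent = []
fake_cast = lambda method, payload, topic: sent.append((method, topic))
busy_agent = {"alive": False, "topic": "dhcp_agent.host-1"}  # heartbeats delayed

notify_port_created(busy_agent, fake_cast, {"id": "p1"}, check_liveness=True)
notify_port_created(busy_agent, fake_cast, {"id": "p1"})  # stop-gap: always send
print(len(sent))  # 1 -- only the stop-gap call got through
```

[The open question from the thread is the fire-and-forget part: even with the stop-gap, nothing tracks whether the agent ever acted on the cast.]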
>>> >>
>>> >> _______________________________________________
>>> >> OpenStack-dev mailing list
>>> >> OpenStack-dev@lists.openstack.org
>>> >> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev