There are two classes of behavior that need to be handled:

1) Some things can only be done after forking, like setting up connections or 
spawning threads.
2) Some things should only be done once, regardless of the number of forks, 
like syncing.

Even when you just want something to happen once, there is a good chance it 
needs to happen post-fork. For example, syncing between the OVSDB and neutron 
databases requires a socket connection, and we don't want 16 copies of that 
sync running.

Case 1 is a little complex due to how we launch the api/rpc workers. The 
obvious place to signal that a fork is complete is in the 
RpcWorker/WorkerService start() methods, since they are the only code outside 
of openstack.common.service that is actually called post-fork. The problem is 
the case where api_workers==rpc_workers==0: the parent process calls start() 
on both, so you end up with two calls to your post-fork initialization and 
only one process. It is easy enough to pass a flag telling start() whether to 
run the initialization itself, or to hold off and let the main process do it 
before calling waitall()--it's just a bit ugly (see my patch: 
https://review.openstack.org/#/c/189391/).
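
A rough sketch of what that flag-passing could look like (the names below, 
like trigger_post_fork and post_fork_initialize, are illustrative rather than 
the actual Neutron/oslo code):

import os

def post_fork_initialize():
    # Hypothetical plugin hook: open connections, spawn threads, etc.
    # Anything done here belongs to exactly one process.
    print("post-fork init in pid %s" % os.getpid())

class RpcWorker(object):
    # Illustrative stand-in for neutron's RpcWorker/WorkerService.

    def __init__(self, trigger_post_fork=True):
        # With api_workers == rpc_workers == 0 the parent calls start()
        # on both workers itself, so the launcher would pass False here
        # and run post_fork_initialize() once before waitall().
        self._trigger_post_fork = trigger_post_fork

    def start(self):
        if self._trigger_post_fork:
            post_fork_initialize()
        # ... normal worker startup continues here ...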

Another option for case 1 would be to kill the special case where a single 
process handles both workers: always have the parent do nothing, and fork a 
process for each api/rpc worker, treating workers=0 as workers=1. Then start() 
can safely do the post-fork work without hacking around the special case.
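
Something like this minimal os.fork() sketch is what I mean by always forking 
(the real openstack.common service launcher is eventlet-based and more 
involved; worker.start()/wait() here are just placeholders):

import os

def launch_workers(worker, count):
    # Treat workers=0 as workers=1 so there is always at least one fork
    # and the parent never runs worker code itself.
    count = max(count, 1)
    children = []
    for _ in range(count):
        pid = os.fork()
        if pid == 0:
            # Child process: start() can now do its post-fork
            # initialization unconditionally.
            worker.start()
            worker.wait()
            os._exit(0)
        children.append(pid)
    return children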

For case 2, the problem is "which process is *the one*?" The fork() call 
happens in the weird bastardized eventlet-threading hybrid openstack.common 
ThreadGroup stuff, so who knows in what order things really happen. The 
easiest thing to detect as unique is the parent process, via some plugin 
pre-fork call that stores the parent's pid. The problem with using the parent 
process for the 'do it once' case is that we have to be able to guarantee that 
all the forking is really done, and the forking happens eventlet-style, so 
asynchronously. Maybe an accumulator that fires off an event once 
api_workers + rpc_workers fork() events have been received? Anyway, it's 
messy.
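
As an illustration of the pid check plus the accumulator idea (all of the 
names here are hypothetical):

import os
import threading

class ForkAccumulator(object):
    # Records the parent pid pre-fork and fires a callback once the
    # expected number of fork notifications have arrived.

    def __init__(self, expected_forks, on_all_forked):
        self._parent_pid = os.getpid()      # stored before any fork()
        self._expected = expected_forks     # e.g. api_workers + rpc_workers
        self._seen = 0
        self._lock = threading.Lock()
        self._on_all_forked = on_all_forked

    def is_parent(self):
        return os.getpid() == self._parent_pid

    def notify_forked(self):
        # Would be invoked once per child fork event.
        with self._lock:
            self._seen += 1
            if self._seen == self._expected:
                self._on_all_forked()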

Another option for case 2 would be to let the plugin specify that it needs its 
own worker process. If so, spawn one and call PluginWorker.start(), which does 
the post-fork initialization. That seems like it could be cleaner.
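
Roughly what I have in mind for PluginWorker (just a sketch; the class and 
hook names are made up):

class PluginWorker(object):
    # A dedicated worker process for the "do it once" plugin work.

    def __init__(self, plugin):
        self._plugin = plugin

    def start(self):
        # Runs only in the single forked child dedicated to the plugin,
        # so connections and threads created here are never shared
        # across processes.
        self._plugin.post_fork_initialize()

The launcher would always fork exactly one of these, independent of 
api_workers/rpc_workers, so the "only one" process is explicit.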

Right now I'm leaning toward "parent always does nothing" + PluginWorker. 
Everything is forked, no special case for workers==0, and explicit designation 
of the "only one" case. Of course, it's still early in the day and I haven't 
had any coffee.

Terry

----- Original Message -----
> This depends on what initialize is supposed to be doing. If it's just a
> one-time sync with a back-end, then I think calling it once in each child
> process might not be what we want.
> 
> I left a comment on Terry's patch. I think we should just use the callback
> manager to have a pre-fork and post-fork event to let drivers/plugins do
> whatever is appropriate for them.
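
A rough sketch of that callback-manager idea; the 'process' resource and the 
before/after-fork event names below are assumptions, not existing constants:

from neutron.callbacks import registry

def pre_fork_handler(resource, event, trigger, **kwargs):
    # Would run once in the parent, before any workers are forked.
    pass

def post_fork_handler(resource, event, trigger, **kwargs):
    # Would run in each child after it has been forked.
    pass

# Hypothetical resource/event names; only the subscribe() call itself
# reflects the real callbacks API.
registry.subscribe(pre_fork_handler, 'process', 'before_fork')
registry.subscribe(post_fork_handler, 'process', 'after_fork')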
> 
> On Mon, Jun 8, 2015 at 1:00 PM, Robert Kukura < kuk...@noironetworks.com >
> wrote:
> 
> 
> 
> From a driver's perspective, it would be simpler, and I think sufficient, to
> change ML2 to call initialize() on drivers after the forking, rather than
> requiring drivers to know about forking.
> 
> -Bob
> 
> 
> On 6/8/15 2:59 PM, Armando M. wrote:
> 
> 
> 
> Interestingly, [1] was filed a few moments ago:
> 
> [1] https://bugs.launchpad.net/neutron/+bug/1463129
> 
> On 2 June 2015 at 22:48, Salvatore Orlando < sorla...@nicira.com > wrote:
> 
> 
> 
> I'm not sure if you can test this behaviour on your own because it requires
> the VMware plugin and the eventlet handling of backend response.
> 
> But the issue was manifesting and had to be fixed with this mega-hack [1].
> The issue was not about several workers executing the same code - the
> loopingcall was always started on a single thread. The issue I witnessed was
> that the other API workers just hung.
> 
> There's probably something we need to understand about how eventlet can work
> safely with an os.fork() (I just think they're not really made to work
> together!).
> Regardless, I did not spend too much time on it, because I thought that the
> multiple workers code might have been rewritten anyway by the pecan switch
> activities you're doing.
> 
> Salvatore
> 
> 
> [1] https://review.openstack.org/#/c/180145/
> 
> On 3 June 2015 at 02:20, Kevin Benton < blak...@gmail.com > wrote:
> 
> 
> 
> Sorry about the long delay.
> 
> > Even the LOG.error("KEVIN PID=%s network response: %s" % (os.getpid(),
> > r.text)) line? Surely the server would have forked before that line was
> > executed - so what could prevent it from executing once in each forked
> > process, and hence generating multiple logs?
> 
> Yes, just once. I wasn't able to reproduce the behavior you ran into. Maybe
> eventlet has some protection for this? Can you provide small sample code for
> the logging driver that does reproduce the issue?
> 
> On Wed, May 13, 2015 at 5:19 AM, Neil Jerram < neil.jer...@metaswitch.com >
> wrote:
> 
> 
> Hi Kevin,
> 
> Thanks for your response...
> 
> On 08/05/15 08:43, Kevin Benton wrote:
> 
> 
> I'm not sure I understand the behavior you are seeing. When your
> mechanism driver gets initialized and kicks off processing, all of that
> should be happening in the parent PID. I don't know why your child
> processes start executing code that wasn't invoked. Can you provide a
> pointer to the code or give a sample that reproduces the issue?
> 
> https://github.com/Metaswitch/calico/tree/master/calico/openstack
> 
> Basically, our driver's initialize method immediately kicks off a green
> thread to audit what is now in the Neutron DB, and to ensure that the other
> Calico components are consistent with that.
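
That pattern is roughly the following (a simplified skeleton, not the actual 
Calico driver):

import eventlet

class CalicoMechanismDriver(object):
    # Simplified skeleton of the driver described above.

    def initialize(self):
        # Spawned in whichever process calls initialize(); if that is the
        # pre-fork parent, every worker inherits the same in-flight work.
        eventlet.spawn(self._resync)

    def _resync(self):
        # Audit the Neutron DB and bring the other Calico components
        # into line with it.
        pass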
> 
> 
> 
> I modified the linuxbridge mech driver to try to reproduce it:
> http://paste.openstack.org/show/216859/
> 
> In the output, I never saw any of the init-code output I added appear more
> than once, including output from the function spawned using eventlet.
> 
> Interesting. Even the LOG.error("KEVIN PID=%s network response: %s" %
> (os.getpid(), r.text)) line? Surely the server would have forked before that
> line was executed - so what could prevent it from executing once in each
> forked process, and hence generating multiple logs?
> 
> Thanks,
> Neil
> 
> 
> 
> The only time I ever saw anything executed by a child process was actual
> API requests (e.g. the create_port method).
> 
> 
> 
> 
> 
> On Thu, May 7, 2015 at 6:08 AM, Neil Jerram < neil.jer...@metaswitch.com >
> wrote:
> 
> Is there a design for how ML2 mechanism drivers are supposed to cope
> with the Neutron server forking?
> 
> What I'm currently seeing, with api_workers = 2, is:
> 
> - my mechanism driver gets instantiated and initialized, and
> immediately kicks off some processing that involves communicating
> over the network
> 
> - the Neutron server process then forks into multiple copies
> 
> - multiple copies of my driver's network processing then continue,
> and interfere badly with each other :-)
> 
> I think what I should do is:
> 
> - wait until any forking has happened
> 
> - then decide (somehow) which mechanism driver is going to kick off
> that processing, and do that.
> 
> But how can a mechanism driver know when the Neutron server forking
> has happened?
> 
> Thanks,
> Neil
> 
> --
> Kevin Benton
