Thanks for your reply, Joshua; some more comments inline. However, I think
I'm probably going off topic here given the initial subject of this thread.
Speaking of Taskflow, and leaving aside the plugins, which do not have much
community-wide interest: do you reckon Taskflow might be something we could
look at to improve reliability and efficiency in the nova/neutron
interface? I have barely started to analyse how that could be done, but it
is something which might deserve another thread of discussion.

Regards,
Salvatore

On 19 November 2013 23:56, Joshua Harlow <harlo...@yahoo-inc.com> wrote:

> Can u explain a little how using celery achieves workflow reliability
> and avoids races (or mitigates spaghetti code)?

Celery surely does not do that. I have probably not been precise enough in
my previous post: I meant to say that celery is merely the tool we are
going to use for managing a distributed task queue.

> To me celery acts as a way to distribute tasks, but does not deal with
> actually forming an easily understandable way of knowing that a piece of
> code that u design is actually going to go through the various state
> transitions (or states & workflows) that u expect (this is a higher level
> mechanism that u can build on top of a distribution system). So this means
> that NVP (or neutron or other?) must be maintaining an orchestration/engine
> layer on top of celery to add on this additional set of code that 'drives'
> celery to accomplish a given workflow in a reliable manner.

That's pretty much correct. The neutron plugin will define workflows and
manage state transitions.

> This starts to sound pretty similar to what taskflow is doing: not being
> a direct competitor to a distributed task queue such as celery, but
> providing this higher level mechanism which adds on these benefits, since
> they are needed anyway.
>
> To me these benefits currently are (the list may get bigger in the future):
>
> 1. A way to define a workflow (in a way that is not tied to celery,
> since celery's '@task' decorator ties u to celery's internal
> implementation).
>    - This includes ongoing work to determine how to easily define a
>    state-machine in a way that is relevant to cinder (and other projects).
> 2. A way to keep track of the state that the workflow goes through (this
> brings along resumption, progress information… when u track at the right
> level).
> 3. A way to execute that workflow reliably (potentially using celery, rpc,
> local threads, or other future hotness).
>    - This becomes important when u ask yourself how u planned on testing
>    celery in the gate/jenkins/CI.
> 4. A way to guarantee that the workflow upon failure is *automatically*
> resumed by some other entity.

Thanks for this clarification. I will surely look at how the NVP plugin can
leverage taskflow; I don't want to reinvent the wheel, or, in other words,
to write code that somebody else has already written for me.

> More details @ http://www.slideshare.net/harlowja/taskflow-27820295
>
> From: Salvatore Orlando <sorla...@nicira.com>
> Date: Tuesday, November 19, 2013 2:22 PM
> To: "OpenStack Development Mailing List (not for usage questions)"
> <openstack-dev@lists.openstack.org>
> Cc: Joshua Harlow <harlo...@yahoo-inc.com>, Isaku Yamahata
> <isaku.yamah...@gmail.com>, Robert Kukura <rkuk...@redhat.com>
> Subject: Re: [openstack-dev] [Neutron] Race condition between DB layer
> and plugin back-end implementation
>
> For what it's worth, we have considered this aspect from the perspective
> of the Neutron plugin my team maintains (NVP) during the past release
> cycle.
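To make the list of benefits above a bit more concrete, here is a toy sketch in plain Python. This is deliberately *not* taskflow's actual API, and every name in it is made up for illustration: tasks are declared independently of the runner (benefit 1), the engine records every state transition (benefit 2), and a failure triggers automatic rollback of the tasks that already ran (benefit 4).

```python
# Toy workflow engine: tasks with execute/revert, tracked states,
# automatic rollback on failure. Illustrative only, not taskflow code.

class Task:
    """Subclasses define execute() and, optionally, revert()."""
    def execute(self, ctx):
        raise NotImplementedError

    def revert(self, ctx):
        pass

class Engine:
    def __init__(self, tasks):
        self.tasks = tasks
        self.states = {}  # task class name -> last recorded state

    def _record(self, task, state):
        self.states[type(task).__name__] = state

    def run(self, ctx):
        done = []
        for task in self.tasks:
            self._record(task, 'RUNNING')
            try:
                task.execute(ctx)
            except Exception:
                self._record(task, 'FAILURE')
                # Roll back already-completed tasks in reverse order.
                for t in reversed(done):
                    t.revert(ctx)
                    self._record(t, 'REVERTED')
                return False
            self._record(task, 'SUCCESS')
            done.append(task)
        return True

# Hypothetical tasks mimicking a two-phase port creation.
class CreatePortInDB(Task):
    def execute(self, ctx):
        ctx['db'] = 'port-row'

    def revert(self, ctx):
        ctx.pop('db', None)

class CreatePortOnBackend(Task):
    def execute(self, ctx):
        if ctx.get('backend_down'):
            raise RuntimeError('NVP unreachable')
        ctx['backend'] = 'port'

engine = Engine([CreatePortInDB(), CreatePortOnBackend()])
ok = engine.run({'backend_down': True})
# ok is False: the DB task ends up REVERTED, the backend task FAILURE.
```

A real engine would also persist `states` somewhere durable, which is what makes resumption by another entity possible.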
> The synchronous model that most plugins with a controller on the backend
> currently implement is simple and convenient, but has some flaws:
>
> - reliability: the current approach, where the plugin orchestrates the
> backend, is not really optimal when it comes to ensuring that your running
> configuration (backend/control plane) is in sync with your desired
> configuration (neutron/mgmt plane); moreover, in some cases, due to
> neutron internals, API calls to the backend are wrapped in a transaction
> too, leading to very long SQL transactions, which are quite dangerous
> indeed. It is not easy to recover from a failure due to an eventlet
> thread deadlocking with a mysql transaction, where by 'recover' I mean
> ensuring neutron and backend state are in sync.
>
> - maintainability: since handling rollback in case of failures on the
> backend and/or the db is cumbersome, this often leads to spaghetti code
> which is very hard to maintain regardless of the effort (ok, I agree here
> that this also depends on how good the devs are; most of the guys in my
> team are very good, but unfortunately they have me too...).
>
> - performance & scalability:
>   - roundtrips to the backend take a non-negligible toll on the duration
>   of an API call, whereas most Neutron API calls should probably just
>   terminate at the DB, just like a nova boot call does not wait for the
>   VM to be ACTIVE to return.
>   - we need to keep some operations serialized in order to avoid the
>   mentioned race issues.
>
> For this reason we're progressively moving toward a change in the NVP
> plugin with a series of patches under this umbrella blueprint [1].
>
> To answer the issues mentioned by Isaku, we've been looking at a task
> management library with an efficient and reliable set of abstractions
> for ensuring operations are properly ordered, thus avoiding those races
> (I agree with the observation on the pre/post commit solution).
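The "terminate at the DB" point above can be sketched roughly as follows. This is a hedged illustration with made-up names and in-memory stand-ins for the Neutron DB and the task queue, not actual NVP plugin code: the API call only persists the desired configuration and returns immediately, like `nova boot` returning before the VM is ACTIVE, and a separate worker reconciles the backend later, so no backend roundtrip ever happens inside the DB transaction.

```python
# Sketch of the async model: API calls terminate at the "DB"; a worker
# syncs the backend afterwards and flips the status. Hypothetical names.

import queue
import threading

db = {}                       # stands in for the neutron DB
backend_jobs = queue.Queue()  # stands in for a task queue (celery, etc.)

def create_network(net_id):
    # Short "transaction": persist desired config only, then return.
    db[net_id] = {'status': 'PENDING_CREATE'}
    backend_jobs.put(net_id)
    return db[net_id]

def backend_worker():
    while True:
        net_id = backend_jobs.get()
        if net_id is None:   # sentinel: shut down
            break
        try:
            # the call to the backend controller would go here
            db[net_id]['status'] = 'ACTIVE'
        except Exception:
            db[net_id]['status'] = 'ERROR'

create_network('net-1')
# The API has returned before any backend work happened:
assert db['net-1']['status'] == 'PENDING_CREATE'

worker = threading.Thread(target=backend_worker)
worker.start()
backend_jobs.put(None)       # let the worker drain the queue, then stop
worker.join()
```

The price of this model is that clients must poll (or be notified of) the status transition, exactly as they already do for nova instances.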
> We are currently looking at using celery [2] rather than taskflow, mostly
> because we already have expertise in using it in our applications, and it
> has very easy abstractions for workflow design, as well as for handling
> task failures.
> That said, I think we're still open to switching to taskflow should we
> become aware of some very good reason for using it.
>
> Regards,
> Salvatore
>
> [1] https://blueprints.launchpad.net/neutron/+spec/nvp-async-backend-communication
> [2] http://docs.celeryproject.org/en/master/index.html
>
> On 19 November 2013 19:42, Joshua Harlow <harlo...@yahoo-inc.com> wrote:
>
>> And also of course, nearly forgot, a similar situation/review in heat:
>>
>> https://review.openstack.org/#/c/49440/
>>
>> Except theirs was/is dealing with stack locking (a heat concept).
>>
>> On 11/19/13 10:33 AM, "Joshua Harlow" <harlo...@yahoo-inc.com> wrote:
>>
>> >If you start adding these states you might really want to consider the
>> >following work that is going on in other projects.
>> >
>> >It surely appears that everyone is starting to hit the same problem (and
>> >joining efforts would produce a more beneficial result).
>> >
>> >Relevant icehouse etherpads:
>> >- https://etherpad.openstack.org/p/CinderTaskFlowFSM
>> >- https://etherpad.openstack.org/p/icehouse-oslo-service-synchronization
>> >
>> >And of course my obvious plug for taskflow (which is designed to be a
>> >useful library to help in all these usages):
>> >
>> >- https://wiki.openstack.org/wiki/TaskFlow
>> >
>> >The states u just mentioned start to line up with
>> >https://wiki.openstack.org/wiki/TaskFlow/States_of_Task_and_Flow
>> >
>> >If this sounds like a useful way to go (joining efforts) then let's see
>> >how we can make it possible.
>> >
>> >IRC: #openstack-state-management is where I am usually at.
>> >On 11/19/13 3:57 AM, "Isaku Yamahata" <isaku.yamah...@gmail.com> wrote:
>> >
>> >>On Mon, Nov 18, 2013 at 03:55:49PM -0500,
>> >>Robert Kukura <rkuk...@redhat.com> wrote:
>> >>
>> >>> On 11/18/2013 03:25 PM, Edgar Magana wrote:
>> >>> > Developers,
>> >>> >
>> >>> > This topic has been discussed before but I do not remember if we
>> >>> > have a good solution or not.
>> >>>
>> >>> The ML2 plugin addresses this by calling each MechanismDriver twice.
>> >>> The create_network_precommit() method is called as part of the DB
>> >>> transaction, and the create_network_postcommit() method is called
>> >>> after the transaction has been committed. Interactions with devices
>> >>> or controllers are done in the postcommit methods. If the postcommit
>> >>> method raises an exception, the plugin deletes that partially-created
>> >>> resource and returns the exception to the client. You might consider
>> >>> a similar approach in your plugin.
>> >>
>> >>Splitting the work into two phases, pre/post, is a good approach,
>> >>but a race window still remains.
>> >>Once the transaction is committed, the result is visible to the
>> >>outside, so concurrent requests to the same resource will be racy:
>> >>there is a window after pre_xxx_yyy() and before post_xxx_yyy() where
>> >>other requests can be handled.
>> >>
>> >>The state machine needs to be enhanced, I think (plugins need
>> >>modification): for example, adding more states like
>> >>pending_{create, delete, update}.
>> >>We would also like to consider serializing between operations on ports
>> >>and subnets, or between operations on subnets and networks, depending
>> >>on performance requirements.
>> >>(Or carefully audit complex status changes, i.e. changing a port
>> >>during subnet/network update/deletion.)
>> >>
>> >>I think it would be useful to establish a reference locking policy
>> >>for the ML2 plugin for SDN controllers.
>> >>Thoughts or comments?
>> >>If this is considered useful and acceptable, I'm willing to help.
>> >>
>> >>thanks,
>> >>Isaku Yamahata
>> >>
>> >>> -Bob
>> >>>
>> >>> > Basically, if concurrent API calls are sent to Neutron, all of
>> >>> > them are sent to the plug-in level, where two actions have to be
>> >>> > made:
>> >>> >
>> >>> > 1. DB transaction - not just for data persistence but also to
>> >>> > collect the information needed for the next action
>> >>> > 2. Plug-in back-end implementation - in our case a call to the
>> >>> > python library that consequently calls the PLUMgrid REST GW
>> >>> > (soon SAL)
>> >>> >
>> >>> > For instance:
>> >>> >
>> >>> > def create_port(self, context, port):
>> >>> >     with context.session.begin(subtransactions=True):
>> >>> >         # Plugin DB - Port Create and Return port
>> >>> >         port_db = super(NeutronPluginPLUMgridV2,
>> >>> >                         self).create_port(context, port)
>> >>> >         device_id = port_db["device_id"]
>> >>> >         if port_db["device_owner"] == "network:router_gateway":
>> >>> >             router_db = self._get_router(context, device_id)
>> >>> >         else:
>> >>> >             router_db = None
>> >>> >         try:
>> >>> >             LOG.debug(_("PLUMgrid Library: create_port() called"))
>> >>> >             # Back-end implementation
>> >>> >             self._plumlib.create_port(port_db, router_db)
>> >>> >         except Exception:
>> >>> >             ...
>> >>> >
>> >>> > The way we have implemented it at the plugin level in Havana (even
>> >>> > in Grizzly) is that both actions are wrapped in the same
>> >>> > "transaction", which automatically rolls back any operation done
>> >>> > to its original state, mostly protecting the DB from having any
>> >>> > inconsistent state or leftover data if the back-end part fails.
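Commenting inline: Isaku's pending_{create,delete,update} suggestion above can be made concrete with a small sketch. This is hypothetical code, not ML2's actual implementation: the precommit phase leaves the resource in a pending state, and every state change is a compare-and-swap, so a concurrent request that catches the resource inside the pre/post window is rejected instead of racing. In a real plugin the CAS would be an `UPDATE ... WHERE status = :expected` on the DB row.

```python
# Sketch: pending states + compare-and-swap status transitions close the
# race window between precommit and postcommit. Hypothetical names only.

import threading

_lock = threading.Lock()   # stands in for DB row-level atomicity
ports = {}

def _cas_status(port_id, expected, new):
    """Atomically move port_id from `expected` to `new`; return success."""
    with _lock:
        if ports.get(port_id, {}).get('status') != expected:
            return False
        ports[port_id]['status'] = new
        return True

def create_port_precommit(port_id):
    # Inside the DB transaction: resource becomes visible, but pending.
    with _lock:
        ports[port_id] = {'status': 'PENDING_CREATE'}

def create_port_postcommit(port_id):
    # After the transaction: the backend call would go here, then the
    # status is flipped only if nobody else touched the resource.
    return _cas_status(port_id, 'PENDING_CREATE', 'ACTIVE')

def delete_port(port_id):
    # A delete arriving in the pre/post window sees PENDING_CREATE, not
    # ACTIVE, so the CAS fails and the request is refused (or retried).
    return _cas_status(port_id, 'ACTIVE', 'PENDING_DELETE')

create_port_precommit('p1')
racy_delete_ok = delete_port('p1')      # False: caught in the race window
created = create_port_postcommit('p1')  # True: p1 becomes ACTIVE
```

The same pattern extends to the serialization between ports/subnets/networks Isaku mentions: each cross-resource operation becomes a CAS on the parent's status.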
>> >>> > The problem that we are experiencing is that, when concurrent
>> >>> > calls to the same API are sent, the operations at the plug-in
>> >>> > back-end take long enough to make the next concurrent API call
>> >>> > get stuck at the DB transaction level, which creates a hung state
>> >>> > for the Neutron server, to the point that all concurrent API
>> >>> > calls will fail.
>> >>> >
>> >>> > This can be fixed if we include some "locking" system such as
>> >>> > calling:
>> >>> >
>> >>> > from neutron.common import utils
>> >>> > ...
>> >>> >
>> >>> > @utils.synchronized('any-name', external=True)
>> >>> > def create_port(self, context, port):
>> >>> >     ...
>> >>> >
>> >>> > Obviously, this will create a serialization of all concurrent
>> >>> > calls, which will end up in really bad performance. Does anyone
>> >>> > have a better solution?
>> >>> >
>> >>> > Thanks,
>> >>> >
>> >>> > Edgar
>> >>> >
>> >>> > _______________________________________________
>> >>> > OpenStack-dev mailing list
>> >>> > OpenStack-dev@lists.openstack.org
>> >>> > http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
>> >>>
>> >>--
>> >>Isaku Yamahata <isaku.yamah...@gmail.com>