On Thu, Jun 07, 2018 at 04:32:10PM -0700, Han Zhou wrote: > On Thu, Jun 7, 2018 at 11:15 AM, Ben Pfaff <b...@ovn.org> wrote: > > > > To review the problem we're trying to solve, ovn-controller uses too > > much CPU in many circumstances because it recomputes everything on every > > iteration on the main loop. That includes when it wakes up for any > > reason, e.g. in response to a timer, or to an "ovs-appctl" command, or > > to respond to an "echo request" probing whether an OpenFlow or OVSDB > > connection is still up. In even a medium-sized network, this is a lot > > of work and can easily lead to ovn-controller using 100% of a CPU > > without doing significant valuable work. > > > > The solution, as you and others have noticed, is to recompute only when > > an input has changed. This can be done on at least three different > > granularities: > > > > 1. Whole system. Recompute everything if any input changed. > > > > 2. Coarsely granular. Recompute intermediate data and output data, > coarsely > > defined, if their respective inputs changed. > > > > 3. Finely granular. Recompute individual rows (etc.) of intermediate or > output > > data if their individual inputs changed. > > > > The 2016 patches from Ryan Moats were mostly #1 with some #2, and, Han, > > as I see it your patch series is mostly #2 with some #3. > > > > Ryan Moats' series was eventually reverted because it was not correct and > could > > not be confidently fixed or maintained. In turn, this was because it was > too > > difficult to figure out each calculation's actual dependencies. This was > in > > fact difficult even for #1, let alone #2. > > > > Your new patch series uses a much more systematic approach to > > dependencies, since it uses a formal incremental processing engine to > > propagate changes through a DAG of computations. However, assuming that > > it is correct as is, I am concerned that it will eventually suffer from > > the same problem. > > > > The way I see it, the incremental computation systems that have been > proposed > > so far are analogous to a Makefile that explicitly lists all of the .h > files > > that each .c file includes (directly or indirectly). If you've ever > maintained > > a Makefile that works this way, you know that it's difficult to get it > right at > > the start and that it is essentially impossible to keep it correct as a > project > > evolves. That's why every modern build system computes header file > > dependencies automatically as a side effect of compilation. Similarly, I > don't > > see a realistic way that we can keep these dependencies correct as > > ovn-controller evolves. > > > > So, I've been spending some time thinking about how we can come up with > the > > ovn--controller equivalent of automatic dependency computation. My > current > > thinking is that we need to adopt a more functional programming attitude > here. > > That is, any given part of the ovn-controller computation should only be > passed > > as input the data that it actually needs, and as output the data > structures > > that it produces. This will give us the opportunity to make sure that the > > input and the dependencies are in sync. Ideally, we'd be able to tie > those > > together in the code, so that passing a piece of data into a computation > > function automatically makes that a dependency from the engine's > perspective. > > Agree. While I don't think it is totally unrealistic, but it is indeed my > biggest concern how to ensure the dependencies are correctly maintained in > the longer term. It was a manual and tedious process for me (although it > helped me to get better understanding of the whole implementation, as a > personal benefit) to figure out the dependencies, by looking at the > parameters passed to those xxx_run() functions, and also check carefully > the global variables, and specially, the OVSDB tables they might be using. > If I miss anything, or in the future some contributor miss updating the > dependency graph when they should, it will result in problem. It would be > great if we have a mechanism to self-describe the dependency from code and > then deduce the dependencies automatically (or it might be good enough to > still adding dependency manually, but relying on detecting and warning for > missing dependencies during make). I have no concrete idea of the > implementation though, any detailed suggestion would be helpful.
I've been playing with some ideas about that this week. I hope to have something to show soon. > > Currently, we're pretty far from being able to do this. Most of the state > > computation functions in ovn-controller receive pointers to the OVSDB and > OVNSB > > IDLs (via struct controller_ctx), which means that they can read and write > > anything in any part of those databases, and even if they don't now it's > easy > > to carelessly add new ones. To pass only a function's actual > dependencies into > > it, we will need to pass, at a minimum, individual tables into a function > > rather than entire databases. This is more difficult than it sounds, > because > > the OVSDB IDL doesn't have a way to pass around a table. Thus, some > steps to > > take here, starting from the easiest: > > > > 1. Add some ability for the IDL to pass around tables. Exact mechanism > not > > obvious yet. > > We need to consider the index usage, too, which also requires passing the > IDL. I can see that, too. I don't think it's too hard to adapt the indexes in a similar way. > > 2. Add some ability for the IDL to pass just part of a table, e.g. > specific > > columns, to make dependencies more specific. This would also make it > > possible to control dependencies related to hopping from one table to > > another via pointers in the table's columns. > > I am not sure how would this help for the "hopping" case. If node N depends > on column C1 and C2 of table T. Assume C2 refers to table T2. If node N's > compute function accesses T2's data, how would this access be controlled by > the mechanism? Presumably, if N depends on C1 and C2, and C2 refers to table T2, then N must also declare a dependency on T2 as well. I think that this can be enforced, by preventing a node from depending on a column without depending on the table that the column references. > > (Not every input comes from an OVSDB instance. Other input sources tend > to be > > easier to deal with. I don't know of specific problems there yet.) > > It seems to me really no other input sources related to flow computing. Do > you have anything in mind? mff_ovn_geneve is the example that comes to mind. > > There's another issue related to OVSDB. Currently ovn-controller only > composes > > OVSDB transactions when there is not already a transaction in flight. > This > > causes trouble for the interaction between some parts of ovn-controller > that > > only do their work when they can compose a transaction and parts that > need to > > keep track of individual changes in tables ("change tracking") because > change > > tracking only keeps the changes for a single main loop iteration. Thus, > change > > tracking changes can be missed if there's a transaction outstanding. It > seems > > like there should be a way to solve this problem by always allowing a > > transaction to be composed (and probably just discarding the transaction > in > > some cases). > > Current patch handles this situation by forcing a full recompute in next > iteration immediately. Being able to either always compose a transaction or > keep track of changes across iterations would avoid the unnecessary > recomputes. This is about a further performance improvement rather than > correctness, so I don't see it urgent for now. I think that's a good summary of the situation. > > I'm currently working on step #1 above, to make dependencies easier to > track by > > passing individual tables into functions. At that point, I think that the > > incremental processing engine makes a lot of sense. > > This is great! Looking forward to the updates and let me know anything else > needed from my side. Thanks again. I think I'm going to try to send out patches, even ones that are incomplete solutions, to see if anyone has comments, pretty soon. _______________________________________________ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev