Hi Han,
I thought about this more over the weekend, and I was hoping I'd get to
respond to my own e-mail before you saw it, because I realized I had a
fundamental misunderstanding of the scope and nature of change handlers.
I'll reply to your comments in-line below.
On 08/05/2018 03:11 PM, Han Zhou wrote:
Hi Mark,
Thanks for the review and very valuable comments! (I was on vacation
last week, so sorry for the slow response).
On Tue, Jul 31, 2018 at 3:47 AM, Mark Michelson <mmich...@redhat.com> wrote:
>
> Hi Han,
>
> I've given this patchset a look, and I was following along pretty
well until I got to about patch 11. From that point on, I had to re-read
the code more times than I care to admit before I finally understood
what was going on :)
>
> What you have is a structure (lflow_ref_list_node) that is
simultaneously in two lists. These two lists are each the data in
separate hmap nodes. The hmap nodes are in two separate hmaps. One hmap
uses the reference type and name as a key, and the other uses the lflow
UUID as a key. This way given an address set name, you can find the
associated logical flow UUIDs in the ref_lflow_table. Or given a logical
flow UUID, you can find address sets.
>
> I'm wondering if this can be simplified somehow.
>
> Right now, if logical flows change, the change handler has to update
the ref_lflow_table so that address sets no longer reference that
logical flow. If address sets change, then the lflow_ref_table is
updated. In both cases consider_logical_flow() gets called and realigns
the tables as appropriate.
>
> The problem with this is that it reeks of cross-cutting concerns, and
it seems like it wouldn't scale well (consider a 3- or 4-chain of
dependencies). Ideally, the dependency chain would make sure that the
change handler for logical flows only deals with logical flows, and the
change handler for address sets only deals with address sets.
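To ground the discussion below, here is roughly the shape of that cross
reference as I understand it from the patches, using OVS's hmap and
ovs_list. The struct and field names are my approximations rather than
the exact ones in the series, and the delete helper is only meant to show
why removing an lflow stays proportional to the number of resources it
references:

#include <stdlib.h>
#include "openvswitch/hmap.h"
#include "openvswitch/list.h"
#include "openvswitch/uuid.h"

/* One entry per (resource name, lflow) pair.  The same node is linked
 * into two lists at once, so it can be reached from either direction. */
struct lflow_ref_list_node {
    struct ovs_list ref_list;    /* In ref_lflow_node's list of lflows. */
    struct ovs_list lflow_list;  /* In lflow_ref_node's list of refs. */
    struct uuid lflow_uuid;
    char *ref_name;              /* E.g. an address set name. */
};

/* Node of ref_lflow_table: resource (type, name) -> lflows that use it. */
struct ref_lflow_node {
    struct hmap_node node;       /* Hashed on reference type + name. */
    char *ref_name;
    struct ovs_list lflow_head;  /* Holds lflow_ref_list_node's. */
};

/* Node of lflow_ref_table: lflow UUID -> resources it references. */
struct lflow_ref_node {
    struct hmap_node node;       /* Hashed on the lflow UUID. */
    struct uuid lflow_uuid;
    struct ovs_list ref_head;    /* Holds lflow_ref_list_node's. */
};

/* Dropping an lflow walks only its own reference list, so the cost is
 * O(number of address sets/port groups referenced by that lflow).
 * (Cleanup of ref_lflow_node entries that become empty is omitted.) */
static void
lflow_refs_delete(struct lflow_ref_node *lrn)
{
    struct lflow_ref_list_node *rlln, *next;

    LIST_FOR_EACH_SAFE (rlln, next, lflow_list, &lrn->ref_head) {
        ovs_list_remove(&rlln->ref_list);    /* Unlink from the ref side. */
        ovs_list_remove(&rlln->lflow_list);  /* Unlink from the lflow side. */
        free(rlln->ref_name);
        free(rlln);
    }
}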
I agree that maintaining the cross reference has some overhead, but I
don't see a scaling issue in this case. Adding entries to the cross
reference table is a by-product of consider_logical_flow() when parsing
the lflow, and deleting the entries is also efficient: O(N), where N is
the number of address sets used by an lflow, which in most cases should
be a very small number (correct me if I am wrong). As to memory
consumption, it maintains only the mapping between resource names and
lflow uuids, so I don't expect it to be a significant cost either. Could
you explain a little more about the "3- or 4-chain of dependencies" example?
My thought was along the lines that table A references table B, which
references table C. A change in table A might result in a change to
table B, which then results in a change to table C. In my head, I
thought this would mean having to maintain a large series of hashmaps
that cross-referenced each other. I realize this isn't correct, though,
and that as long as the dependency chain is linear, this isn't any
different from what you already have proposed here.
In reality, I can't think of any example changes in the southbound
database that would cause such a series of events, but I may not be
thinking hard enough :)
For the "cross-cutting concern", I don't see it that way. I see it as a
pattern of change handler implementation. In general, the output of an
engine node is the result of a "join" operation of its inputs. When
there are multiple inputs and one of them changes, for a change handler
to compute the output incrementally, we will need to use the changed row
to probe all the other inputs to update the final output. For the
address-set and port-group handlers, it is a join between two tables only,
and the cross reference table is built to make the probing efficient in the
change handler. The cross reference table is also generalized so that
any resources referenced by logical flows can reuse the same data
structure and interfaces, and now it is reused by both address-sets and
port-groups. We can generalize it further for other mappings if needed.
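As a concrete (and purely schematic) example of that probing, an
address-set change handler could look roughly like this, reusing the
structures sketched earlier; the function name is made up and the
reprocessing step stands in for consider_logical_flow():

#include <stdbool.h>
#include <stdint.h>
#include <string.h>
#include "hash.h"                     /* hash_string(), from lib/hash.h. */
#include "openvswitch/hmap.h"
#include "openvswitch/list.h"

/* Schematic handler for a changed address set 'as_name'.  It probes the
 * other input (the logical flows) through ref_lflow_table and reprocesses
 * only the flows that actually reference the set.  Returning true signals
 * the change was handled incrementally, per the handler convention in the
 * series. */
static bool
handle_address_set_change(const char *as_name, struct hmap *ref_lflow_table)
{
    uint32_t hash = hash_string(as_name, 0);
    struct ref_lflow_node *rfn;

    HMAP_FOR_EACH_WITH_HASH (rfn, node, hash, ref_lflow_table) {
        if (strcmp(rfn->ref_name, as_name)) {
            continue;                 /* Hash collision with another name. */
        }

        struct lflow_ref_list_node *rlln;
        LIST_FOR_EACH (rlln, ref_list, &rfn->lflow_head) {
            /* Reprocess just this flow, e.g. via consider_logical_flow()
             * keyed on rlln->lflow_uuid. */
        }
        return true;
    }
    return true;                      /* No lflow references this set. */
}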
Yes, and this sort of thinking is what I had over the weekend that made
me have a "Eureka!" moment and realize what I had been missing here.
I had been looking at the address set change handler and thinking of the
change handler as being an "owner" of address set data. The reason is
that the engine node is tied directly to updates to the address set
table. It felt like it was overstepping its boundaries by then stepping
through data that I thought was owned by the logical flow change
handler. The fact that both change handlers acted on the same data just
struck me as wrong.
However, I realized I need to stop thinking about data ownership in that
sort of way when it comes to change handlers. Engine nodes do have
ownership like I was imagining in their run() method, but change
handlers are very different. They are responsible for analyzing not
just the data that has changed, but also the relationship that data
has with other data. That other data may or may
not be tied to other engine nodes.
If we really want to make it more general, I think the answer is the
datalog approach. It would be great if it could be implemented that way,
but I am pessimistic about it being applied to ovn-controller on a
practical timeline, given that ovn-controller is more complex in terms
of both data sources and processing logic compared with ovn-northd. And
I think it is practical and simple to implement the probing for the most
frequent scenarios, as demonstrated by this RFC.
I agree. I don't think datalog is the correct approach for ovn-controller.
>
> If we generalize things a bit, there are likely to be two ways
dependencies manifest in the database. In this particular case, text in
one row expands to data of a separate database row. The other case would
be where a database row contains the UUID (or list of UUIDs) of other
database rows.
>
> For the textual case, I think the easiest way to handle this is to
replace the text with what it expands to earlier than when we currently
do it. Consider that a logical flow references address set $foo.
Currently, the logical flow in the southbound database has the text
"$foo" in it. If $foo were replaced with the actual addresses from the
address set, then when an address set changes, the text of the logical
flow would change as well, thus resulting in a direct change of the
logical flow. A less disruptive version of this might be to use some
reserved character automatically in the logical flow match followed by a
sequence number. So for instance, if a logical flow were set up to
reference address set $foo, then the actual logical flow might be
something like $foo?1. Then if northbound address set foo changes, the
logical flow could be updated to $foo?2 by ovn-northd. Again, the
textual change in the logical flow would result in triggering the change
tracker.
>
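To make the idea concrete, the suffixing itself is trivial; something
like the following on the ovn-northd side, where the generation number
is bumped whenever the northbound address set changes. This is purely
illustrative; nothing like this exists today:

#include <stdio.h>

/* The '?' separator and the per-address-set generation number come from
 * the example above ("$foo?1" -> "$foo?2").  Bumping 'generation' makes
 * every southbound lflow match that embeds the reference change
 * textually, which trips ordinary lflow change tracking in
 * ovn-controller. */
static void
format_address_set_ref(const char *as_name, unsigned int generation,
                       char *buf, size_t buf_size)
{
    /* E.g. format_address_set_ref("foo", 2, buf, sizeof buf) -> "$foo?2". */
    snprintf(buf, buf_size, "$%s?%u", as_name, generation);
}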
This proposal is interesting and I think it is a valid alternative. It
is trying to implement the probing without maintaining a cross reference
table in ovn-controller. In fact it moves the effort of building the
reference table from ovn-controller to maintaining the sequence number
for each address-set/port-group resource in ovn-northd. I am just not
sure if this makes the system simpler or more complex. I will need to
think more about it.
Yes, having thought about this some more, I agree that this could just
be trading one complexity for another. Plus, aside from address sets and
port groups, I'm not sure that there are any other text-expansion-type
references in the southbound database. So engineering a big solution for
this may not have a lot of bang for your buck. It may be worth
workshopping just to see, though.
> For the database referencing case, it would be nice if the IDL change
tracking code could automatically do this for us. This way if record foo
has a column that references row bar, then if bar changes, we would be
told that foo also changed. This strikes me as difficult to implement
and could result in some interesting dependency graphs within the IDL
code, though.
>
> What do you think?
>
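For what it's worth, the propagation being described would conceptually
look like the sketch below. This is hypothetical: the OVSDB IDL does not
do this today and these are not real IDL types; it only illustrates the
"bar changed, therefore foo changed" behavior:

#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for an IDL-tracked row. */
struct tracked_row {
    bool changed;                        /* Set by ordinary change tracking. */
    struct tracked_row **referenced_by;  /* Rows whose columns point at us. */
    size_t n_referenced_by;
};

/* If 'row' (e.g. bar) changed, also report every row that references it
 * (e.g. foo) as changed. */
static void
propagate_reference_changes(struct tracked_row *row)
{
    for (size_t i = 0; i < row->n_referenced_by; i++) {
        struct tracked_row *referrer = row->referenced_by[i];
        if (!referrer->changed) {
            referrer->changed = true;
            /* Follow chains: a change in C marks B, which marks A, etc. */
            propagate_reference_changes(referrer);
        }
    }
}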
For the database referencing case, it does not seem directly related to your
concern regarding address-set handling (or port-group handling). Please
correct me if I misunderstood something here. But I agree with the idea of
utilizing and improving the IDL capability to build the dependency graph
for table references, and this is exactly in my TODO as mentioned in the
cover letter:
You are correct that this does not apply to address set or port group
referencing by logical flows. The relationship between logical flows and
address sets and port groups is not currently expressible at the IDL
level. I was thinking ahead a bit about how other tables may refer to
each other. I foresaw similar structures in change handlers for those
tables and wondered if that could be handled at the IDL level instead.
"For exposing the dependencies introduced by reference access, it is a big
TODO item and it is the major reason this patch series is RFC only."
Thanks,
Han