On 12/9/24 2:14 PM, [email protected] wrote: > On Mon, 2024-12-09 at 13:37 +0100, Felix Huettner wrote: >> On Fri, Dec 06, 2024 at 10:10:21AM +0100, >> [email protected] wrote: >>> On Thu, 2024-12-05 at 15:28 +0100, Felix Huettner wrote: >>>> On Thu, Dec 05, 2024 at 02:08:37PM +0100, >>>> [email protected] wrote: >>>>> Hi Felix and Dumitru, >>>>> Thanks for posting this standalone change Felix. I'm going to >>>>> take >>>>> it >>>>> for a spin and try to integrate it to Frode's original "route >>>>> leaking" >>>>> RFC. >>>>> I have one question/suggestion for the schema that I'll leave >>>>> below. >>>>> >>>>> On Thu, 2024-12-05 at 14:00 +0100, Felix Huettner via dev >>>>> wrote: >>>>>> The Route table will be used in the future to coordinate >>>>>> routing >>>>>> information between northd and ovn-controller. >>>>>> Northd will insert routes that should be advertised to the >>>>>> outside >>>>>> fabric. >>>>>> Ovn-controller will insert routes that have been learned from >>>>>> the >>>>>> outside fabric. >>>>>> >>>>>> Signed-off-by: Felix Huettner <[email protected]> >>>>>> --- >>>>>> v2->v3: >>>>>> * refType is now strong >>>>>> * fixed typos >>>>>> >>>>>> >>>>>> ovn-sb.ovsschema | 29 ++++++++++++++++-- >>>>>> ovn-sb.xml | 80 >>>>>> ++++++++++++++++++++++++++++++++++++++++++++++++ >>>>>> 2 files changed, 107 insertions(+), 2 deletions(-) >>>>>> >>>>>> diff --git a/ovn-sb.ovsschema b/ovn-sb.ovsschema >>>>>> index 73abf2c8d..88763f429 100644 >>>>>> --- a/ovn-sb.ovsschema >>>>>> +++ b/ovn-sb.ovsschema >>>>>> @@ -1,7 +1,7 @@ >>>>>> { >>>>>> "name": "OVN_Southbound", >>>>>> - "version": "20.37.0", >>>>>> - "cksum": "1950136776 31493", >>>>>> + "version": "20.38.0", >>>>>> + "cksum": "1392903395 32893", >>>>>> "tables": { >>>>>> "SB_Global": { >>>>>> "columns": { >>>>>> @@ -617,6 +617,31 @@ >>>>>> "type": {"key": "string", "value": >>>>>> "string", >>>>>> "min": 0, "max": >>>>>> "unlimited"}}}, >>>>>> "indexes": [["chassis"]], >>>>>> + "isRoot": true}, >>>>>> + "Route": { >>>>>> + "columns": { >>>>>> + "datapath": >>>>>> + {"type": {"key": {"type": "uuid", >>>>>> + "refTable": >>>>>> "Datapath_Binding"}}}, >>>>>> + "logical_port": {"type": {"key": {"type": >>>>>> "uuid", >>>>>> + >>>>>> "refTable": >>>>>> "Port_Binding", >>>>>> + "refType": >>>>>> "strong"}}}, >>>>>> + "ip_prefix": {"type": "string"}, >>>>>> + "nexthop": {"type": "string"}, >>>>>> + "tracked_port": {"type": {"key": {"type": >>>>>> "uuid", >>>>>> + >>>>>> "refTable": >>>>>> "Port_Binding", >>>>>> + "refType": >>>>>> "strong"}, >>>>>> + "min": 0, >>>>>> + "max": 1}}, >>>>>> + "type": {"type": {"key": {"type": "string", >>>>>> + "enum": ["set", >>>>>> ["advertise", >>>>>> + >>>>>> "receive"]]}, >>>>>> + "min": 1, "max": 1}}, >>>>> >>>>> I wonder if this field is necessary. We could use presence or >>>>> absence >>>>> of "nexthop" value to determine whether the route is being >>>>> learned >>>>> or >>>>> advertised. I'm not sure if ovsdb is doing something clever >>>>> with >>>>> enums >>>>> or stores them as raw values in each row, but removing this >>>>> could >>>>> save >>>>> us some space. What do you think? >>>> >>>> Hi Martin, >>>> >>>> i think generally we could use the information in the "nexthop" >>>> column >>>> for this. However i would find this rather surprising if i am not >>>> deeply >>>> in the implementation. For debugging at least it is not obvious. >>>> >>>> I am not sure how much space saving we would get out of this, but >>>> i >>>> don't think that it is worth the cost of less debuggability. >>>> Especially >>>> since each learned/announced route will cause additionally once >>>> or >>>> multiple logical flows to be created which are significantly >>>> larger >>>> than >>>> this type field. >>>> >>>> But i am open to other opinions as well. >>>> >>>> Thanks a lot >>>> Felix >>> >>> Hi Felix, >>> I'm not hell-bent on this change and I agree that it would make the >>> code less explicit. It just occurred to me because I recently saw >>> similar pattern applied in NAT table [0] where NAT rules is >>> considered >>> "distributed" if the record contains both logical_port and >>> external_nat. >> >> Hi Martin, >> >> i think a big difference here is that we need to have both ovn-northd >> and ovn-controller agree to what each of these things mean. >> In the above example it looked like something only northd needs to >> know >> internally to derive the correct flows. >> >>> >>> I don't have enough experience with ovsdb in large deployments, it >>> was >>> just a gut feel that potentially saving ~8bytes for a table that >>> could >>> have potentially 100s or 1000s of records could be impactful. >>> There's also the fact that without the server-side validation (i.e. >>> if >>> tpye is "advertised" then nexthop must not be empty), it's up to >>> the >>> clients to enforce that invalid records are not created. >> >> I just spend some time measureing it by adding a few thousand Routes >> and >> checking the RSS difference. >> With the current schema (actually with Dumitru's requested >> adjustments) >> we are at ~3230 Byte per route. If we remove the "type" we are at >> ~2514 >> Byte. So ~716 Byte difference. >> >> If we calculate with 10.000 Routes then we end up at roughly 7MB of >> difference. >> I would think this is worth the clarity. >> >> Additionally it allows us to later actually use nexthop also for >> advertised routes. Allthough honestly i don't know a usecase for >> this. >> >> Whats your opinion on that one? >> >> Thanks a lot >> Felix > > Hi Felix, > thanks for running numbers on this. The difference indeed seems > negligible even on large deployments. You are right that the trade-off > for removing the "type" column is probably not worth it. > > I've put this version (v3) to work on Friday by integrating it to > fnordahl's original route leaking RFC and one more though occurred to > me. > > Would it be worth to split the `Route` table to `Advertised_Route` and > `Learned_Route`? I think this could be beneficial because: > * Both "types" of routes have different "owners". While "Learned > routes" are primarily managed "from the bottom", by the controller, > "Advertised routes" are managed "from the top", by northd. Having two > different components being in charge of keeping the table up to date > seems to complicate things. > * "Advertised routes" could do with just following attributes: > * IP prefix > * logical_port (strong ref to LRP that's supposed to advertise this > prefix) > * With two separate tables, development of learning/advertising would > be more independent as changing requirements in one feature wouldn't > affect development of the other. > > With the simplified structure of the `Advertised_Route` table that I > mentioned above, northd would be the only component that adds data to > this table and records would be either explicitly removed by northd > (when NAT or LB gets removed) or implicitly when the LRP is removed and > `logical_port` reference goes away. > > What do you think? >
Good point, Martin. Now that you mention it it kind of makes sense that we don't mix Advertised with Learnt. It reduces a bit complexity and chance of getting I-P wrong in ovn-controller / ovn-northd too. +1 from me for separate tables. Regards, Dumitru > Thanks, > Martin. > >> >>> >>> The issue of transparency could be solved with documentation and >>> abstraction in parsing method (like in the case of the NAT). >>> >>> Anyway, as I said, this is not a blocker from my side, just a >>> suggestion for consideration. >>> >>> Martin >>> >>> [0] >>> https://github.com/ovn-org/ovn/blob/3b32b7d08ebcd005681db5131c798da07a89133a/northd/northd.c#L16506-L16530 >>>> >>>>> >>>>> Martin. >>>>> >>>>>> + "external_ids": { >>>>>> + "type": {"key": "string", "value": >>>>>> "string", >>>>>> + "min": 0, "max": >>>>>> "unlimited"}}}, >>>>>> + "indexes": [["datapath", "logical_port", >>>>>> "ip_prefix", >>>>>> "nexthop", >>>>>> + "type"]], >>>>>> "isRoot": true} >>>>>> } >>>>>> } >>>>>> diff --git a/ovn-sb.xml b/ovn-sb.xml >>>>>> index ea4adc1c3..e435cf243 100644 >>>>>> --- a/ovn-sb.xml >>>>>> +++ b/ovn-sb.xml >>>>>> @@ -5217,4 +5217,84 @@ tcp.flags = RST; >>>>>> The set of variable values for a given chassis. >>>>>> </column> >>>>>> </table> >>>>>> + >>>>>> + <table name="Route"> >>>>>> + <p> >>>>>> + Each record represents a route that is exported from >>>>>> ovn >>>>>> or >>>>>> imported to >>>>>> + ovn using some dynamic routing logic outside of ovn. >>>>>> + It is populated on the one hand by <code>ovn- >>>>>> northd</code> >>>>>> based on the >>>>>> + addresses, routes and NAT Entries of a >>>>>> + <code>OVN_Northbound.Logical_Router_Port</code>. >>>>>> + On the other hand <code>ovn-controller</code> >>>>>> populates it >>>>>> with routes >>>>>> + learned from outside of OVN. >>>>>> + </p> >>>>>> + >>>>>> + <column name="datapath"> >>>>>> + The datapath belonging to the >>>>>> + <code>OVN_Northbound.Logical_Router</code> that this >>>>>> route >>>>>> is >>>>>> valid >>>>>> + for. >>>>>> + </column> >>>>>> + >>>>>> + <column name="logical_port"> >>>>>> + <p> >>>>>> + If the type is <code>advertise</code> then this is >>>>>> the >>>>>> logical_port >>>>>> + the router will send packets out. >>>>>> + </p> >>>>>> + >>>>>> + <p> >>>>>> + If the type is <code>receive</code> then this is the >>>>>> logical_port >>>>>> + the route was learned on. >>>>>> + </p> >>>>>> + </column> >>>>>> + >>>>>> + <column name="ip_prefix"> >>>>>> + <p> >>>>>> + IP prefix of this route (e.g. 192.168.100.0/24). >>>>>> + </p> >>>>>> + </column> >>>>>> + >>>>>> + <column name="nexthop"> >>>>>> + <p> >>>>>> + If the type is <code>advertise</code> then this is >>>>>> empty. >>>>>> + </p> >>>>>> + >>>>>> + <p> >>>>>> + If the type is <code>receive</code> then this is the >>>>>> nexthop >>>>>> ip we >>>>>> + from the outside. >>>>>> + </p> >>>>>> + </column> >>>>>> + >>>>>> + <column name="tracked_port"> >>>>>> + <p> >>>>>> + Only relevant for type <code>advertise</code>. >>>>>> + >>>>>> + In combination with a host <code>ip_prefix</code> >>>>>> this >>>>>> tracks the port >>>>>> + OVN will forward the packets for this destination >>>>>> to. >>>>>> + >>>>>> + An announcing chassis can use this information to >>>>>> check >>>>>> if >>>>>> this >>>>>> + destination is local and adjust the route priorities >>>>>> based >>>>>> on that. >>>>>> + </p> >>>>>> + </column> >>>>>> + >>>>>> + <column name="type"> >>>>>> + <p> >>>>>> + If the route is to be exported from OVN to the >>>>>> outside >>>>>> network or if >>>>>> + it is imported from the outside network. >>>>>> + </p> >>>>>> + <ul> >>>>>> + <li> >>>>>> + <code>advertise</code>: This route should be >>>>>> advertised to >>>>>> the >>>>>> + outside network. >>>>>> + </li> >>>>>> + <li> >>>>>> + <code>receive</code>: This route has been learned >>>>>> from >>>>>> the >>>>>> outside >>>>>> + network. >>>>>> + </li> >>>>>> + </ul> >>>>>> + </column> >>>>>> + >>>>>> + <column name="external_ids"> >>>>>> + See <em>External IDs</em> at the beginning of this >>>>>> document. >>>>>> + </column> >>>>>> + </table> >>>>>> </database> >>>>> >>>>> _______________________________________________ >>>>> dev mailing list >>>>> [email protected] >>>>> https://mail.openvswitch.org/mailman/listinfo/ovs-dev >>> >>> _______________________________________________ >>> dev mailing list >>> [email protected] >>> https://mail.openvswitch.org/mailman/listinfo/ovs-dev > > _______________________________________________ > dev mailing list > [email protected] > https://mail.openvswitch.org/mailman/listinfo/ovs-dev _______________________________________________ dev mailing list [email protected] https://mail.openvswitch.org/mailman/listinfo/ovs-dev
