On 12/9/24 2:14 PM, [email protected] wrote:
> On Mon, 2024-12-09 at 13:37 +0100, Felix Huettner wrote:
>> On Fri, Dec 06, 2024 at 10:10:21AM +0100,
>> [email protected] wrote:
>>> On Thu, 2024-12-05 at 15:28 +0100, Felix Huettner wrote:
>>>> On Thu, Dec 05, 2024 at 02:08:37PM +0100,
>>>> [email protected] wrote:
>>>>> Hi Felix and Dumitru,
>>>>> Thanks for posting this standalone change Felix. I'm going to
>>>>> take
>>>>> it
>>>>> for a spin and try to integrate it to Frode's original "route
>>>>> leaking"
>>>>> RFC.
>>>>> I have one question/suggestion for the schema that I'll leave
>>>>> below.
>>>>>
>>>>> On Thu, 2024-12-05 at 14:00 +0100, Felix Huettner via dev
>>>>> wrote:
>>>>>> The Route table will be used in the future to coordinate
>>>>>> routing
>>>>>> information between northd and ovn-controller.
>>>>>> Northd will insert routes that should be advertised to the
>>>>>> outside
>>>>>> fabric.
>>>>>> Ovn-controller will insert routes that have been learned from
>>>>>> the
>>>>>> outside fabric.
>>>>>>
>>>>>> Signed-off-by: Felix Huettner <[email protected]>
>>>>>> ---
>>>>>> v2->v3:
>>>>>>   * refType is now strong
>>>>>>   * fixed typos
>>>>>>         
>>>>>>
>>>>>>  ovn-sb.ovsschema | 29 ++++++++++++++++--
>>>>>>  ovn-sb.xml       | 80
>>>>>> ++++++++++++++++++++++++++++++++++++++++++++++++
>>>>>>  2 files changed, 107 insertions(+), 2 deletions(-)
>>>>>>
>>>>>> diff --git a/ovn-sb.ovsschema b/ovn-sb.ovsschema
>>>>>> index 73abf2c8d..88763f429 100644
>>>>>> --- a/ovn-sb.ovsschema
>>>>>> +++ b/ovn-sb.ovsschema
>>>>>> @@ -1,7 +1,7 @@
>>>>>>  {
>>>>>>      "name": "OVN_Southbound",
>>>>>> -    "version": "20.37.0",
>>>>>> -    "cksum": "1950136776 31493",
>>>>>> +    "version": "20.38.0",
>>>>>> +    "cksum": "1392903395 32893",
>>>>>>      "tables": {
>>>>>>          "SB_Global": {
>>>>>>              "columns": {
>>>>>> @@ -617,6 +617,31 @@
>>>>>>                      "type": {"key": "string", "value":
>>>>>> "string",
>>>>>>                               "min": 0, "max":
>>>>>> "unlimited"}}},
>>>>>>              "indexes": [["chassis"]],
>>>>>> +            "isRoot": true},
>>>>>> +        "Route": {
>>>>>> +            "columns": {
>>>>>> +                "datapath":
>>>>>> +                    {"type": {"key": {"type": "uuid",
>>>>>> +                                      "refTable":
>>>>>> "Datapath_Binding"}}},
>>>>>> +                "logical_port": {"type": {"key": {"type":
>>>>>> "uuid",
>>>>>> +                                                 
>>>>>> "refTable":
>>>>>> "Port_Binding",
>>>>>> +                                                  "refType":
>>>>>> "strong"}}},
>>>>>> +                "ip_prefix": {"type": "string"},
>>>>>> +                "nexthop": {"type": "string"},
>>>>>> +                "tracked_port": {"type": {"key": {"type":
>>>>>> "uuid",
>>>>>> +                                                 
>>>>>> "refTable":
>>>>>> "Port_Binding",
>>>>>> +                                                  "refType":
>>>>>> "strong"},
>>>>>> +                                          "min": 0,
>>>>>> +                                          "max": 1}},
>>>>>> +                "type": {"type": {"key": {"type": "string",
>>>>>> +                                          "enum": ["set",
>>>>>> ["advertise",
>>>>>> +                                                          
>>>>>> "receive"]]},
>>>>>> +                                    "min": 1, "max": 1}},
>>>>>
>>>>> I wonder if this field is necessary. We could use presence or
>>>>> absence
>>>>> of "nexthop" value to determine whether the route is being
>>>>> learned
>>>>> or
>>>>> advertised. I'm not sure if ovsdb is doing something clever
>>>>> with
>>>>> enums
>>>>> or stores them as raw values in each row, but removing this
>>>>> could
>>>>> save
>>>>> us some space. What do you think? 
>>>>
>>>> Hi Martin,
>>>>
>>>> i think generally we could use the information in the "nexthop"
>>>> column
>>>> for this. However i would find this rather surprising if i am not
>>>> deeply
>>>> in the implementation. For debugging at least it is not obvious.
>>>>
>>>> I am not sure how much space saving we would get out of this, but
>>>> i
>>>> don't think that it is worth the cost of less debuggability.
>>>> Especially
>>>> since each learned/announced route will cause additionally once
>>>> or
>>>> multiple logical flows to be created which are significantly
>>>> larger
>>>> than
>>>> this type field.
>>>>
>>>> But i am open to other opinions as well.
>>>>
>>>> Thanks a lot
>>>> Felix
>>>
>>> Hi Felix,
>>> I'm not hell-bent on this change and I agree that it would make the
>>> code less explicit. It just occurred to me because I recently saw
>>> similar pattern applied in NAT table [0] where NAT rules is
>>> considered
>>> "distributed" if the record contains both logical_port and
>>> external_nat.
>>
>> Hi Martin,
>>
>> i think a big difference here is that we need to have both ovn-northd
>> and ovn-controller agree to what each of these things mean.
>> In the above example it looked like something only northd needs to
>> know
>> internally to derive the correct flows.
>>
>>>
>>> I don't have enough experience with ovsdb in large deployments, it
>>> was
>>> just a gut feel that potentially saving ~8bytes for a table that
>>> could
>>> have potentially 100s or 1000s of records could be impactful.
>>> There's also the fact that without the server-side validation (i.e.
>>> if
>>> tpye is "advertised" then nexthop must not be empty), it's up to
>>> the
>>> clients to enforce that invalid records are not created.
>>
>> I just spend some time measureing it by adding a few thousand Routes
>> and
>> checking the RSS difference.
>> With the current schema (actually with Dumitru's requested
>> adjustments)
>> we are at ~3230 Byte per route. If we remove the "type" we are at
>> ~2514
>> Byte. So ~716 Byte difference.
>>
>> If we calculate with 10.000 Routes then we end up at roughly 7MB of
>> difference.
>> I would think this is worth the clarity.
>>
>> Additionally it allows us to later actually use nexthop also for
>> advertised routes. Allthough honestly i don't know a usecase for
>> this.
>>
>> Whats your opinion on that one?
>>
>> Thanks a lot
>> Felix
> 
> Hi Felix,
> thanks for running numbers on this. The difference indeed seems
> negligible even on large deployments. You are right that the trade-off
> for removing the "type" column is probably not worth it.
> 
> I've put this version (v3) to work on Friday by integrating it to
> fnordahl's original route leaking RFC and one more though occurred to
> me.
> 
> Would it be worth to split the `Route` table to `Advertised_Route` and
> `Learned_Route`? I think this could be beneficial because:
>   * Both "types" of routes have different "owners". While "Learned
> routes" are primarily managed "from the bottom", by the controller,
> "Advertised routes" are managed "from the top", by northd. Having two
> different components being in charge of keeping the table up to date
> seems to complicate things.
>   * "Advertised routes" could do with just following attributes:
>     * IP prefix
>     * logical_port (strong ref to LRP that's supposed to advertise this
> prefix)
>   * With two separate tables, development of learning/advertising would
> be more independent as changing requirements in one feature wouldn't
> affect development of the other.
> 
> With the simplified structure of the `Advertised_Route` table that I
> mentioned above, northd would be the only component that adds data to
> this table and records would be either explicitly removed by northd
> (when NAT or LB gets removed) or implicitly when the LRP is removed and
> `logical_port` reference goes away.
> 
> What do you think?
> 

Good point, Martin.  Now that you mention it it kind of makes sense that
we don't mix Advertised with Learnt.  It reduces a bit complexity and
chance of getting I-P wrong in ovn-controller / ovn-northd too.

+1 from me for separate tables.

Regards,
Dumitru

> Thanks,
> Martin.
> 
>>
>>>
>>> The issue of transparency could be solved with documentation and
>>> abstraction in parsing method (like in the case of the NAT).
>>>
>>> Anyway, as I said, this is not a blocker from my side, just a
>>> suggestion for consideration.
>>>
>>> Martin
>>>
>>> [0]
>>> https://github.com/ovn-org/ovn/blob/3b32b7d08ebcd005681db5131c798da07a89133a/northd/northd.c#L16506-L16530
>>>>
>>>>>
>>>>> Martin.
>>>>>
>>>>>> +                "external_ids": {
>>>>>> +                    "type": {"key": "string", "value":
>>>>>> "string",
>>>>>> +                             "min": 0, "max":
>>>>>> "unlimited"}}},
>>>>>> +            "indexes": [["datapath", "logical_port",
>>>>>> "ip_prefix",
>>>>>> "nexthop",
>>>>>> +                         "type"]],
>>>>>>              "isRoot": true}
>>>>>>      }
>>>>>>  }
>>>>>> diff --git a/ovn-sb.xml b/ovn-sb.xml
>>>>>> index ea4adc1c3..e435cf243 100644
>>>>>> --- a/ovn-sb.xml
>>>>>> +++ b/ovn-sb.xml
>>>>>> @@ -5217,4 +5217,84 @@ tcp.flags = RST;
>>>>>>        The set of variable values for a given chassis.
>>>>>>      </column>
>>>>>>    </table>
>>>>>> +
>>>>>> +  <table name="Route">
>>>>>> +    <p>
>>>>>> +      Each record represents a route that is exported from
>>>>>> ovn
>>>>>> or
>>>>>> imported to
>>>>>> +      ovn using some dynamic routing logic outside of ovn.
>>>>>> +      It is populated on the one hand by <code>ovn-
>>>>>> northd</code>
>>>>>> based on the
>>>>>> +      addresses, routes and NAT Entries of a
>>>>>> +      <code>OVN_Northbound.Logical_Router_Port</code>.
>>>>>> +      On the other hand <code>ovn-controller</code>
>>>>>> populates it
>>>>>> with routes
>>>>>> +      learned from outside of OVN.
>>>>>> +    </p>
>>>>>> +
>>>>>> +    <column name="datapath">
>>>>>> +      The datapath belonging to the
>>>>>> +      <code>OVN_Northbound.Logical_Router</code> that this
>>>>>> route
>>>>>> is
>>>>>> valid
>>>>>> +      for.
>>>>>> +    </column>
>>>>>> +
>>>>>> +    <column name="logical_port">
>>>>>> +      <p>
>>>>>> +        If the type is <code>advertise</code> then this is
>>>>>> the
>>>>>> logical_port
>>>>>> +        the router will send packets out.
>>>>>> +      </p>
>>>>>> +
>>>>>> +      <p>
>>>>>> +        If the type is <code>receive</code> then this is the
>>>>>> logical_port
>>>>>> +        the route was learned on.
>>>>>> +      </p>
>>>>>> +    </column>
>>>>>> +
>>>>>> +    <column name="ip_prefix">
>>>>>> +      <p>
>>>>>> +        IP prefix of this route (e.g. 192.168.100.0/24).
>>>>>> +      </p>
>>>>>> +    </column>
>>>>>> +
>>>>>> +    <column name="nexthop">
>>>>>> +      <p>
>>>>>> +        If the type is <code>advertise</code> then this is
>>>>>> empty.
>>>>>> +      </p>
>>>>>> +
>>>>>> +      <p>
>>>>>> +        If the type is <code>receive</code> then this is the
>>>>>> nexthop
>>>>>> ip we
>>>>>> +        from the outside.
>>>>>> +      </p>
>>>>>> +    </column>
>>>>>> +
>>>>>> +    <column name="tracked_port">
>>>>>> +      <p>
>>>>>> +        Only relevant for type <code>advertise</code>.
>>>>>> +
>>>>>> +        In combination with a host <code>ip_prefix</code>
>>>>>> this
>>>>>> tracks the port
>>>>>> +        OVN will forward the packets for this destination
>>>>>> to.
>>>>>> +
>>>>>> +        An announcing chassis can use this information to
>>>>>> check
>>>>>> if
>>>>>> this
>>>>>> +        destination is local and adjust the route priorities
>>>>>> based
>>>>>> on that.
>>>>>> +      </p>
>>>>>> +    </column>
>>>>>> +
>>>>>> +    <column name="type">
>>>>>> +      <p>
>>>>>> +        If the route is to be exported from OVN to the
>>>>>> outside
>>>>>> network or if
>>>>>> +        it is imported from the outside network.
>>>>>> +      </p>
>>>>>> +      <ul>
>>>>>> +        <li>
>>>>>> +          <code>advertise</code>: This route should be
>>>>>> advertised to
>>>>>> the
>>>>>> +          outside network.
>>>>>> +       </li>
>>>>>> +        <li>
>>>>>> +          <code>receive</code>: This route has been learned
>>>>>> from
>>>>>> the
>>>>>> outside
>>>>>> +          network.
>>>>>> +        </li>
>>>>>> +      </ul>
>>>>>> +    </column>
>>>>>> +
>>>>>> +    <column name="external_ids">
>>>>>> +      See <em>External IDs</em> at the beginning of this
>>>>>> document.
>>>>>> +    </column>
>>>>>> +  </table>
>>>>>>  </database>
>>>>>
>>>>> _______________________________________________
>>>>> dev mailing list
>>>>> [email protected]
>>>>> https://mail.openvswitch.org/mailman/listinfo/ovs-dev
>>>
>>> _______________________________________________
>>> dev mailing list
>>> [email protected]
>>> https://mail.openvswitch.org/mailman/listinfo/ovs-dev
> 
> _______________________________________________
> dev mailing list
> [email protected]
> https://mail.openvswitch.org/mailman/listinfo/ovs-dev

_______________________________________________
dev mailing list
[email protected]
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Reply via email to