Re: [ovs-discuss] [OVN] flow explosion in lr_in_arp_resolve table

Numan Siddique Thu, 07 May 2020 23:39:08 -0700

On Fri, May 8, 2020 at 11:54 AM Han Zhou <zhou...@gmail.com> wrote:

> (Add the MLs back)
>
> On Thu, May 7, 2020 at 4:01 PM Girish Moodalbail <gmoodalb...@gmail.com>
> wrote:
>
>> Hello Han,
>>
>> Sorry, I was monitoring the ovn-kubernetes google group and didn't see
>> your emails till now.
>>
>>
>>>
>>> On the other hand, why wouldn't splitting the join logical switch into
>>> 1000 LSes solve the problem? I understand that there will be 1000 more
>>> datapaths, and 1000 more LRPs, but these are all O(n), which is much more
>>> efficient than the O(n^2) explosion. What are the other scale issues
>>> created by this?
>>>
>>
>> Splitting the single join logical switch into 1000 separate logical
>> switches is how I have resolved the problem for now. However, with this
>> design I see the following issues.
>> (1) Complexity
>>    where one logical switch should have sufficed, we now need to create
>> 1000 logical switches just to work around the O(n^2) logical flows
>> (2) IPAM management
>>   - before I had one IP subnet 100.64.0.0/16 for the single logical
>> switch and depended on OVN IPAM to allocate IPs off of that subnet
>>   - now I need to first do subnet management (break a /16 into /29 CIDRs)
>> in OVN K8s and then assign one subnet to each of the join logical switches
>> (3) Each of these join logical switches is a distributed switch. The flows
>> related to each one of them will be present in each hypervisor. This will
>> increase the number of OpenFlow flows. However, from the OVN K8s point of
>> view each logical switch is essentially pinned to a hypervisor and its role
>> is to connect the hypervisor's l3gateway to the distributed router.
>>
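To make the subnet-management step in (2) concrete, here is a minimal
sketch of carving 100.64.0.0/16 into /29 blocks, one per join logical
switch. It uses only Python's standard ipaddress module. The function name
carve_join_subnets is hypothetical, and this is not how OVN K8s actually
implements it.

```python
import ipaddress
import itertools


def carve_join_subnets(parent: str, new_prefix: int, count: int):
    """Return the first `count` subnets of length `new_prefix` inside `parent`.

    Hypothetical helper for illustration only; raises ValueError when the
    parent network cannot supply `count` subnets of the requested size.
    """
    network = ipaddress.ip_network(parent)
    subnets = list(
        itertools.islice(network.subnets(new_prefix=new_prefix), count)
    )
    if len(subnets) < count:
        raise ValueError(
            f"{parent} holds only {len(subnets)} /{new_prefix} subnets, "
            f"{count} requested"
        )
    return subnets


if __name__ == "__main__":
    # One /29 per node in a 1000-node cluster.
    join_subnets = carve_join_subnets("100.64.0.0/16", 29, 1000)
    print(join_subnets[0])    # subnet for the first node
    print(join_subnets[-1])   # subnet for the 1000th node
    # A /29 has 8 addresses, of which 6 are usable host addresses.
    print(len(list(join_subnets[0].hosts())))
```

A /16 holds 8192 /29 blocks, so 1000 nodes fit comfortably, but the
bookkeeping that OVN IPAM used to do on a single subnet now has to live
outside OVN.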
>> We are trying to simplify the OVN logical topology for OVN K8s so that
>> the number of logical flows (and therefore the number of OpenFlow flows)
>> is reduced, which reduces the pressure on ovn-northd, the OVN SB DB, and
>> finally the ovn-controller processes.
>>
>> Every node in an OVN K8s cluster adds 4 resources. So, in a 1000-node
>> k8s cluster we will have 4000 + 1 (distributed router). This ends up
>> creating around 250K OpenFlow rules in each of the hypervisors. This number
>> is just to support the initial logical topology. I am not accounting for
>> any flows that will be generated for k8s network policies, services, and so
>> on.
>>
>>
>>>
>>> In addition, Girish, for the external LS, I am not sure why it can't be
>>> shared, if all the nodes are connected to a single L2 network. (If they are
>>> connected to separate L2 networks, different external LSes should be
>>> created, at least according to the current OVN model).
>>>
>>
>> Yes, the plan was to share the same external LS with all of the L3
>> gateway routers since they are all on the same broadcast domain. However,
>> we will end up with the same 2M logical flows since a single external LS
>> connects all the L3 gateway routers on the same broadcast domain.
>>
>> In short, for a 1000-node K8s cluster, if we fix the logical flow
>> explosion, then we can reduce the number of logical resources in the OVN
>> K8s topology by 1998 (1000 join LSes will become 1 and 1000 external LSes
>> will become 1).
>>
>>
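The O(n^2) versus O(n) argument and the 1998 figure quoted above can be
checked with a back-of-the-envelope model. This is only a sketch: it
assumes every router port on a shared switch needs a neighbour entry for
every other router port on that switch, and that each split join switch
carries exactly two router ports (the gateway router and the distributed
router). The flows_per_peer multiplier and all function names are
hypothetical. The real per-peer count depends on how many addresses each
port carries.

```python
def arp_resolve_flows_shared(n_routers: int, flows_per_peer: int = 1) -> int:
    """n routers on ONE switch: each resolves every other one, so O(n^2)."""
    return n_routers * (n_routers - 1) * flows_per_peer


def arp_resolve_flows_split(n_nodes: int, flows_per_peer: int = 1) -> int:
    """One switch per node with two router ports that resolve each other: O(n)."""
    return n_nodes * 2 * flows_per_peer


def switches_saved(n_nodes: int) -> int:
    """Collapsing n join LSes into 1 and n external LSes into 1."""
    return (n_nodes - 1) * 2


if __name__ == "__main__":
    n = 1000
    print("shared switch:", arp_resolve_flows_shared(n))
    print("split switches:", arp_resolve_flows_split(n))
    # Assumption: two entries per peer (for example one per address family)
    # brings the shared case close to the ~2M figure mentioned in the thread.
    print("shared, 2 per peer:", arp_resolve_flows_shared(n, flows_per_peer=2))
    print("logical switches saved:", switches_saved(n))
```

Under these assumptions, 1000 routers on a shared switch give 999,000
entries per peer address, against 2,000 for the split design, and
collapsing both sets of switches removes 1998 logical switches.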
> Ok, so now we are not satisfied with even O(n), and instead we want to
> make it O(1) for some of the resources.
> I think the major problem is the per-node gateway routers. They do not
> seem really necessary in theory. Ideally the topology can be simplified with
> the concept of distributed gateway ports, on a single logical router (the
> join router), and then we can remove all the join LSes and gateway routers,
> something like below:
>
>     +------------------------------------------+
>     |        external logical switch           |
>     +-+-------------+--------------------+-----+
>       |             |                    |
> +-----+-----+ +-----------+        +-----+-----------+
> | dgp1@node1| | dgp2@node2|   ...  |dgp1000@node1000 |
> +-----+-----+ +-----+-----+        +-----+-----------+
>       |             |                    |
>     +-+-------------+--------------------+-----+
>     |             logical router               |
>     +------------------------------------------+
>
> (dgp = distributed gateway port)
>
> This way, you only need one router, and also one external logical switch,
> and there won't be the O(n^2) flow explosion problem for ARP resolving
> because you have 1 LR only. The number of logical routers and switches
> becomes O(1). The number of router ports is still O(n), but it is also
> halved.
>
> In reality, there are some problems with this solution that need to be
> addressed.
>
> Firstly, it would require some change in OVN because currently OVN has a
> limitation that each LR can only have one distributed gateway port. However,
> there doesn't seem to be anything fundamental that would prevent us from
> removing that restriction to support multiple distributed gateway ports on a
> single LR. I'd like to hear from more OVN folks in case there is some reason
> we shouldn't do this.
>
>
I'd be happy if ovn-kube makes use of logical routers with distributed
gateway ports and avoids the transit logical switches.
I'm fine with adding multiple distributed gateway ports.



Thanks
Numan



> The other thing that I am not so sure about is connecting the logical
> router to the external logical switch through multiple ports. This means we
> will have multiple ports of the logical router on the same subnet, which is
> something we don't traditionally do. However, I think maybe this
> will work with OVN static routes with src routing and output_port specified
> so that the LR knows which port (and chassis) to send the traffic out of,
> provided that there is only one nexthop, which is the default external GW.
> If multiple nexthops need to be supported, this won't work (and we probably
> will have to look at the solution that avoids the static neighbour table
> population).
>
> Thanks,
> Han
>
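The source-routing idea in Han's last paragraph can be modelled as a small
lookup table: one route per node, matched on the source prefix, all
pointing at the same nexthop but each naming a different output port. This
is a toy model of the behaviour, not OVN code. The per-node subnets carved
from 10.128.0.0/14, the nexthop 192.0.2.1, and every name below are
hypothetical. Only the dgpN port naming follows the diagram above.

```python
import ipaddress
import itertools
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class SrcRoute:
    src_prefix: ipaddress.IPv4Network   # traffic sourced from this node subnet
    nexthop: ipaddress.IPv4Address      # the single external default gateway
    output_port: str                    # distributed gateway port to leave by


def build_routes(n_nodes: int, nexthop: str,
                 pod_supernet: str = "10.128.0.0/14") -> List[SrcRoute]:
    """One source-based route per node; subnet layout is an assumption."""
    node_subnets = itertools.islice(
        ipaddress.ip_network(pod_supernet).subnets(new_prefix=24), n_nodes
    )
    gateway = ipaddress.ip_address(nexthop)
    return [
        SrcRoute(subnet, gateway, f"dgp{i}")
        for i, subnet in enumerate(node_subnets, start=1)
    ]


def select_route(routes: List[SrcRoute], src_ip: str) -> Optional[SrcRoute]:
    """Pick the longest source-prefix match, or None if nothing matches."""
    addr = ipaddress.ip_address(src_ip)
    matches = [r for r in routes if addr in r.src_prefix]
    if not matches:
        return None
    return max(matches, key=lambda r: r.src_prefix.prefixlen)


if __name__ == "__main__":
    routes = build_routes(1000, "192.0.2.1")
    route = select_route(routes, "10.128.0.5")
    print(route.output_port, "via", route.nexthop)
```

Because every route shares one nexthop, the source prefix alone decides
the egress port and therefore the chassis. With several nexthops the
table would need one neighbour binding per nexthop per port, which is the
case Han flags as not covered.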

_______________________________________________
discuss mailing list
disc...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-discuss
