On Wed, Oct 2, 2019 at 9:11 AM Dumitru Ceara <dce...@redhat.com> wrote:
>
> On Tue, Oct 1, 2019 at 8:41 PM Han Zhou <zhou...@gmail.com> wrote:
> >
> >
> >
> > On Tue, Oct 1, 2019 at 3:34 AM Dumitru Ceara <dce...@redhat.com> wrote:
> > >
> > > Hi,
> > >
> > > We've hit a scaling issue recently [1] in the following topology:
> > >
> > > - External network connected to public logical switch "LS-pub"
> > > - ~300 logical networks (LR-i <--> LS-i <--> VMi) connected to LS-pub
> > > with dnat_and_snat rules.
> > >
> > > While trying to ping the VMs from outside, the ARP request packet from
> > > the external host doesn't reach all the LR-i pipelines because it gets
> > > dropped due to "Too many resubmits".
> > >
> > > This happens because the MC_FLOOD group for LS-pub installs openflow
> > > entries that resubmit the packet:
> > > - through the patch ports towards all LR-i (one entry in table 32 with
> > > 300 resubmit actions).
> > > - to the egress pipeline for each VIF that's local to LS-pub (one
> > > entry in table 33).
> > >
> > > This means that for the ARP broadcast packet we'll execute the LR-i
> > > ingress/egress pipeline 300 times. For each execution we do a fair
> > > amount of resubmits through the different tables of the pipeline
> > > leading to a total number of resubmits for the single initial
> > > broadcast packet of more than 4K, the maximum allowed by OVS.
> > >
> > > After looking at the implementation I couldn't figure out a way to
> > > avoid running the full pipelines for each potential logical output
> > > port (patch or local VIF) because we might have flows later in the
> > > pipelines that perform actions based on the value of the logical
> > > output port (e.g., out ACL, qos).
> > >
> > > Do you think there's a different approach that we could use to
> > > implement flooding of broadcast/unknown unicast packets that would
> > > require fewer resubmit actions?
> > >
> > > This issue could also appear in a flat topology with a single logical
> > > switch and multiple VIFs (>300). In this case the resubmits would be
> > > part of the openflow entry in table 33 but the result would be the
> > > same: too many resubmits due to the egress pipeline resubmits for each
> > > logical output port.
> > >
> > > Thanks,
> > > Dumitru
> > >
> > > [1] https://bugzilla.redhat.com/show_bug.cgi?id=1756945
> >
> > Thanks Dumitru for reporting this interesting problem.
>
> Hi Han,
>
> Thanks for the reply.
>
> > In theory, I don't think we can avoid this, because different datapaths
> > can have different actions for the same packet. When a packet needs to
> > be sent to multiple datapaths, we have to let it go through each step
> > of each datapath.
> > In practice, the number of stages to go through depends on the number
> > of output ports in the broadcast domain and the number of stages for
> > each output port. To alleviate the problem, we should try our best to
> > handle broadcast packets at the earliest stages of each datapath, and
> > terminate the packet pipeline as early as possible, e.g. ARP response,
> > to reduce the total number of resubmits.
> >
> > So I think what we can do is the following (although I hope there are
> > better ways):
> > - Increase the limit in OVS for the maximum allowed resubmits,
> > sized for the worst-case reasonable real-world deployment
>
> I'm afraid that increasing the limit will just postpone the problem
> for later. Also I'm worried about the time the packet would spend in
> the pipeline and the fact that potentially there will be high jitter
> between delivery times to different ports.
>

It should not be a big concern if fast-path flows are generated. But I am
not sure if that's the case for ARPing different IPs. Did you observe high
jitter with a large number of ports in the same L2 domain?
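
As a rough illustration of the resubmit budget discussed above, here is a
minimal back-of-envelope sketch in Python. It assumes "4K" means 4096 and
that every output port costs one resubmit to enter its pipeline plus a fixed
number of resubmits inside it. The per-port figures and the function names
are hypothetical, chosen only for illustration, not measured from OVN.

```python
# Back-of-envelope model of the resubmit budget for one flooded packet.
# The per-port resubmit counts used below are illustrative assumptions,
# not values measured from a real OVN pipeline.

RESUBMIT_LIMIT = 4096  # "4K" from the thread, taken here as 4096 (assumption)


def total_resubmits(num_ports: int, resubmits_per_port: int) -> int:
    """Resubmits for one flooded packet: one per output port to enter its
    pipeline, plus the resubmits executed inside that port's pipeline."""
    return num_ports * (1 + resubmits_per_port)


def max_ports(limit: int, resubmits_per_port: int) -> int:
    """Largest number of output ports that stays within the limit."""
    return limit // (1 + resubmits_per_port)


if __name__ == "__main__":
    # Compare a few hypothetical per-port costs for the 300-port topology.
    for per_port in (5, 13, 20):
        used = total_resubmits(300, per_port)
        status = "over" if used > RESUBMIT_LIMIT else "within"
        print(f"{per_port} resubmits/port: {used} total ({status} limit), "
              f"max ports = {max_ports(RESUBMIT_LIMIT, per_port)}")
```

Under this model the budget scales linearly with the number of output
ports, so raising the limit only moves the ceiling proportionally, while
shortening the per-port pipeline raises the number of ports that fit.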

> > - Try to avoid large broadcast domains in deployments (of course, only
> > when we have a choice)
>
> Agreed, but I guess this is outside the scope of OVN.
>
> > - See if there is an opportunity to move broadcast-related processing
> > earlier in the pipeline, so that processing of the packet completes as
> > early as possible.
> >
>
> I don't think the problem is broadcast specific. It should be the same
> for unknown unicast, right?
>
For unknown unicast, the packet will only be sent to the ports with
"unknown" address, which is usually the "localnet" port, which exists on
external logical switches and only one per LS, so I think that's not a
problem. (Remember, there is no flooding for "unknown" unicast in OVN
unless someone configures multiple ports with "unknown". I am not aware of
such a use case yet.)
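
The difference between the two cases can be sketched with a small Python
model. This is only an illustration of the behaviour described above, not
OVN code: the `Port` class, the port names, and the MAC strings are
hypothetical, and excluding the ingress port is a simplifying assumption.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Port:
    """Hypothetical stand-in for a logical switch port."""
    name: str
    addresses: List[str] = field(default_factory=list)


def broadcast_targets(ports: List[Port], ingress: str) -> List[str]:
    """A broadcast packet goes to every port except the one it came from."""
    return [p.name for p in ports if p.name != ingress]


def unknown_unicast_targets(ports: List[Port], ingress: str) -> List[str]:
    """An unknown unicast packet goes only to ports configured with the
    "unknown" address."""
    return [p.name for p in ports
            if p.name != ingress and "unknown" in p.addresses]


if __name__ == "__main__":
    # One localnet-style port with "unknown", 300 router-facing ports,
    # and one VIF that sends the packet.
    ports = [Port("ln-pub", ["unknown"])]
    ports += [Port(f"to-lr-{i}", [f"mac-{i}"]) for i in range(300)]
    ports.append(Port("vif-src", ["mac-src"]))

    print(len(broadcast_targets(ports, "vif-src")))
    print(unknown_unicast_targets(ports, "vif-src"))
```

With a single "unknown" port per logical switch, the unknown unicast case
runs one output pipeline, while the broadcast case runs one per port.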

> > Thanks,
> > Han
>
> Thanks,
> Dumitru
_______________________________________________
discuss mailing list
disc...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-discuss
