On Fri, Feb 7, 2025 at 1:09 PM Ilya Maximets <[email protected]> wrote:
>
> On 2/7/25 21:45, Ilya Maximets wrote:
> > On 2/7/25 15:01, Ilya Maximets wrote:
> >> On 2/7/25 05:31, Numan Siddique wrote:
> >>> On Thu, Feb 6, 2025 at 3:57 AM Ilya Maximets <[email protected]>
wrote:
> >>>>
> >>>> This patch set introduces ability to directly connect switches,
> >>>> including transit switches, in order to achieve higher total port
count
> >>>> and locality of changes in L2 topologies spread across multiple
> >>>> availability zones.  And while tailored for this use case, the
changes
> >>>> do not impose any limitations and should allow for all kinds of other
> >>>> different topologies.
> >>>>
> >>>> Amount of the logic code changes is relatively small, most of the
diff
> >>>> are new tests for the introduced functionality.
> >>>>
> >>>>
> >>>> Version 2:
> >>>>
> >>>>   * Rebased on top of latest changes on the main branch.
> >>>>   * Improved validation of the peer in northd. [Mark]
> >>>>   * Added a test for a switch port with an address set. [Mark]
> >>>>
> >>>>
> >>>> Ilya Maximets (2):
> >>>>   northd: Add support for spine-leaf logical switch topology.
> >>>>   ic: Add support for spine-leaf topology for transit switches.
> >>>
> >>> Hi Ilya,
> >>>
> >>> Thanks for adding this feature.  I applied both the patches to the
> >>> main.  I had to do a minor rebase.
> >>
> >> Thanks, Numan and Mark!
> >>
> >>>
> >>> I see one small issue  with the spine switch.  Since all the spine
> >>> switch ports have unknown address,
> >>> the packet will be cloned to N - 1 logical switches if N switches
connect to it.
> >>>
> >>> I think we should enable "fdb learn" in the spine switch with the
below changes
> >>>
> >>> -----------------------------------------------
> >>> diff --git a/northd/northd.c b/northd/northd.c
> >>> index 880112c3b9..e77936fbe9 100644
> >>> --- a/northd/northd.c
> >>> +++ b/northd/northd.c
> >>> @@ -5682,6 +5682,7 @@ build_lswitch_learn_fdb_op(
> >>>      ovs_assert(op->nbsp);
> >>>
> >>>      if (!op->n_ps_addrs && op->has_unknown &&
(!strcmp(op->nbsp->type, "") ||
> >>> +        !strcmp(op->nbsp->type, "switch") ||
> >>>          (lsp_is_localnet(op->nbsp) &&
localnet_can_learn_mac(op->nbsp)))) {
> >>>          ds_clear(match);
> >>>          ds_clear(actions);
> >>> -----------------------------------------------
> >>>
> >>> What do you think ?  Any concerns or objections ?   If not, can you
> >>> submit a follow up patch to enable this ?
> >>
> >> This seems reasonable.  I missed the part that we enable FDB
automatically
> >> for some ports with "unknown".  Will look into that and submit a fix.
> >
> > So, I gave this some more thoughts and I'm not sure if we actually need
> > FDB for "switch" ports in a common case.  In a non-IC setup OVN actually
> > knows the destination, if it's part of OVN network.  The whole
processing
> > of the spine switch will happen on the source node and only the egress
> > pipeline of the leaf switch will be executed on the destination node.
> > If we have actual VM ports with "unknown", then those will have the FDB
> > enabled, but we should not need to consult FDB for the spine switch
itself
> > otherwise.  Does that make sense?
>
> Thinking more about this again, there is still an issue if we have
"unknown"
> ports in leaf switches.  Since those switches are separate from the spine,
> will broadcast in the spine and then every leaf that has "unknown" will
also
> broadcast within itself.  And so, "unknown"  VM ports will receive traffic
> destined to ports on other leaf switches.  So, yes, we still need FDB for
> this case.
>
> >
> > OTOH, what we may actually need is FDB learning for remote ports in IC
> > setup.  In this case, OVN doesn't actually know the whole topology and
> > can't figure out to which of the remote ports the packet should go, so
it
> > will broadcast.  The problem here, however, is that FDB stages are in
the
> > ingress pipeline, and they will not be executed on the remote node that
> > actually needs them.  So, we may need to create FDB stages in the egress
> > pipeline for packets coming from remote ports.  Seems like a generic
> > issue with "unknown" ports on a transit switch.
> >
> > Mark, Numan, Dumitru, what do you think?
>
> But this issue also still stands as we need a way to learn MAC addresses
> from packet arriving from a remote port, so we either need FDB in the
egress
> pipeline of the transit switch for remote ports, or we need an external
> service like ovn-ic synchronizing FDB entries between AZs, which is not
good.
>

Hi Ilya, I agree with you that synchronizing FDB entries between AZs is a
really bad idea. FDB in the egress pipeline (and learning the mac for
inport of the spine switch) sounds reasonable.

Thanks,
Han

> >
> > Best regards, Ilya Maximets
> _______________________________________________
> dev mailing list
> [email protected]
> https://mail.openvswitch.org/mailman/listinfo/ovs-dev
_______________________________________________
dev mailing list
[email protected]
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Reply via email to