> On 14-Mar-2024, at 9:07 PM, Dumitru Ceara <dce...@redhat.com> wrote:
> 
> On 3/14/24 15:21, Naveen Yerramneni wrote:
>> 
>> 
>>> On 08-Mar-2024, at 2:37 PM, Ales Musil <amu...@redhat.com> wrote:
>>> 
>>> 
>>> 
>>> On Wed, Mar 6, 2024 at 8:24 PM Naveen Yerramneni 
>>> <naveen.yerramn...@nutanix.com> wrote:
>>> 
>>> 
>>>> On 18-Dec-2023, at 8:53 PM, Dumitru Ceara <dce...@redhat.com> wrote:
>>>> 
>>>> On 12/18/23 16:17, Naveen Yerramneni wrote:
>>>>> 
>>>>> 
>>>>>> On 18-Dec-2023, at 7:26 PM, Dumitru Ceara <dce...@redhat.com> wrote:
>>>>>> 
>>>>>> On 11/30/23 16:32, Dumitru Ceara wrote:
>>>>>>> On 11/30/23 15:54, Naveen Yerramneni wrote:
>>>>>>>> 
>>>>>>>> 
>>>>>>>>> On 30-Nov-2023, at 6:06 PM, Dumitru Ceara <dce...@redhat.com> wrote:
>>>>>>>>> 
>>>>>>>>> On 11/30/23 09:45, Naveen Yerramneni wrote:
>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>>>> On 29-Nov-2023, at 2:24 PM, Dumitru Ceara <dce...@redhat.com> wrote:
>>>>>>>>>>> 
>>>>>>>>>>> On 11/29/23 07:45, naveen.yerramneni wrote:
>>>>>>>>>>>> This functionality can be enabled at the logical switch level:
>>>>>>>>>>>> - "other_config:fdb_local" can be used to enable/disable this
>>>>>>>>>>>> functionality, it is disabled by default.
>>>>>>>>>>>> - "other_config:fdb_local_idle_timeout" sepcifies idle timeout
>>>>>>>>>>>> for locally learned fdb flows, default timeout is 300 secs.
>>>>>>>>>>>> 
>>>>>>>>>>>> If enabled, below lflow is added for each port that has unknown 
>>>>>>>>>>>> addr set.
>>>>>>>>>>>> - table=2 (ls_in_lookup_fdb), priority=100, match=(inport == 
>>>>>>>>>>>> <in_port>),
>>>>>>>>>>>> action=(commit_fdb_local(timeout=<timeout>); next;
>>>>>>>>>>>> 
>>>>>>>>>>>> New OVN action: "commit_fdb_local". This sets following OVS action.
>>>>>>>>>>>> - 
>>>>>>>>>>>> learn(table=71,idle_timeout=<timeout>,delete_learned,OXM_OF_METADATA[],
>>>>>>>>>>>>      
>>>>>>>>>>>> NXM_OF_ETH_DST[]=NXM_OF_ETH_SRC[],load:NXM_NX_REG14[]->NXM_NX_REG15[])
>>>>>>>>>>>> 
>>>>>>>>>>>> This is useful when OVN is managing VLAN network that has multiple 
>>>>>>>>>>>> ports
>>>>>>>>>>>> set with unknown addr and localnet_learn_fdb is enabled. With this 
>>>>>>>>>>>> config,
>>>>>>>>>>>> if there is east-west traffic flowing between VMs part of same VLAN
>>>>>>>>>>>> deployed on different hypervisors then, MAC addrs of the source and
>>>>>>>>>>>> destination VMs keeps flapping between VM port and localnet port 
>>>>>>>>>>>> in 
>>>>>>>>>>>> Southbound FDB table. Enabling fdb_local config makes fdb table 
>>>>>>>>>>>> local to
>>>>>>>>>>>> the chassis and avoids MAC flapping.
>>>>>>>>>>>> 
>>>>>>>>>>>> Signed-off-by: Naveen Yerramneni <naveen.yerramn...@nutanix.com>
>>>>>>>>>>>> ---
>>>>>>>>>>> 
>>>>>>>>>>> Hi Naveen,
>>>>>>>>>>> 
>>>>>>>>>>> Thanks a lot for the patch!
>>>>>>>>>>> 
>>>>>>>>>>> Just a note, we already have a fix for the east-west traffic that 
>>>>>>>>>>> causes
>>>>>>>>>>> FDB flapping when localnet is used:
>>>>>>>>>>> 
>>>>>>>>>>> https://urldefense.proofpoint.com/v2/url?u=https-3A__github.com_ovn-2Dorg_ovn_commit_2acf91e9628e9481c48e4a6cec8ad5159fdd6d2e&d=DwICaQ&c=s883GpUCOChKOHiocYtGcg&r=2PQjSDR7A28z1kXE1ptSm6X36oL_nCq1XxeEt7FkLmA&m=kPuq992rikXYk63APGxlIpfqY3lPpreN9f4ha9pZKpodnVgE9KfjEUNozpPUFzUu&s=LP9_zs2Rj34vMx20ntbu-A3taXqKMJNVH2TLQyOXCh0&e=
>>>>>>>>>>>  
>>>>>>>>>>> 
>>>>>>>>>>> https://urldefense.proofpoint.com/v2/url?u=https-3A__github.com_ovn-2Dorg_ovn_commit_f3a14907fe2b1ecdcfddfbed595cd097b6efbe14&d=DwICaQ&c=s883GpUCOChKOHiocYtGcg&r=2PQjSDR7A28z1kXE1ptSm6X36oL_nCq1XxeEt7FkLmA&m=kPuq992rikXYk63APGxlIpfqY3lPpreN9f4ha9pZKpodnVgE9KfjEUNozpPUFzUu&s=gsUGtjyf9gSOr1LkcCH0O6MB1_tjXi9fuTgwEFgbRx8&e=
>>>>>>>>>>>  
>>>>>>>>>>> 
>>>>>>>>>>> In general, however, I think it's a very good idea to move the FDB 
>>>>>>>>>>> away
>>>>>>>>>>> from the Southbound and make it local to each hypervisor.  That 
>>>>>>>>>>> reduces
>>>>>>>>>>> load on the Southbound among other things.
>>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>>> Hi Dumitru,
>>>>>>>>>> 
>>>>>>>>>> Thanks for informing about the patches.
>>>>>>>>>> Yes, local FDB reduces load on southbound.
>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>>>>> include/ovn/actions.h |   7 +++
>>>>>>>>>>>> lib/actions.c         |  94 ++++++++++++++++++++++++++++++++++++
>>>>>>>>>>>> northd/northd.c       |  26 ++++++++++
>>>>>>>>>>>> ovn-nb.xml            |  14 ++++++
>>>>>>>>>>>> tests/ovn.at [ovn.at]          | 108 
>>>>>>>>>>>> ++++++++++++++++++++++++++++++++++++++++++
>>>>>>>>>>>> utilities/ovn-trace.c |   2 +
>>>>>>>>>>>> 6 files changed, 251 insertions(+)
>>>>>>>>>>>> 
>>>>>>>>>>>> diff --git a/include/ovn/actions.h b/include/ovn/actions.h
>>>>>>>>>>>> index 49cfe0624..85ac92cd3 100644
>>>>>>>>>>>> --- a/include/ovn/actions.h
>>>>>>>>>>>> +++ b/include/ovn/actions.h
>>>>>>>>>>>> @@ -127,6 +127,7 @@ struct collector_set_ids;
>>>>>>>>>>>> OVNACT(CHK_LB_AFF,        ovnact_result)          \
>>>>>>>>>>>> OVNACT(SAMPLE,            ovnact_sample)          \
>>>>>>>>>>>> OVNACT(MAC_CACHE_USE,     ovnact_null)            \
>>>>>>>>>>>> +    OVNACT(COMMIT_FDB_LOCAL,  ovnact_commit_fdb_local) \
>>>>>>>>>>>> 
>>>>>>>>>>>> /* enum ovnact_type, with a member OVNACT_<ENUM> for each action. 
>>>>>>>>>>>> */
>>>>>>>>>>>> enum OVS_PACKED_ENUM ovnact_type {
>>>>>>>>>>>> @@ -514,6 +515,12 @@ struct ovnact_commit_lb_aff {
>>>>>>>>>>>> uint16_t timeout;
>>>>>>>>>>>> };
>>>>>>>>>>>> 
>>>>>>>>>>>> +/* OVNACT_COMMIT_FBD_LOCAL. */
>>>>>>>>>>>> +struct ovnact_commit_fdb_local{
>>>>>>>>>>>> +    struct ovnact ovnact;
>>>>>>>>>>>> +    uint16_t timeout;  /* fdb_local flow timeout */
>>>>>>>>>>>> +};
>>>>>>>>>>>> +
>>>>>>>>>>>> /* Internal use by the helpers below. */
>>>>>>>>>>>> void ovnact_init(struct ovnact *, enum ovnact_type, size_t len);
>>>>>>>>>>>> void *ovnact_put(struct ofpbuf *, enum ovnact_type, size_t len);
>>>>>>>>>>>> diff --git a/lib/actions.c b/lib/actions.c
>>>>>>>>>>>> index a73fe1a1e..f5aa78db1 100644
>>>>>>>>>>>> --- a/lib/actions.c
>>>>>>>>>>>> +++ b/lib/actions.c
>>>>>>>>>>>> @@ -5236,6 +5236,98 @@ format_MAC_CACHE_USE(const struct 
>>>>>>>>>>>> ovnact_null *null OVS_UNUSED, struct ds *s)
>>>>>>>>>>>> ds_put_cstr(s, "mac_cache_use;");
>>>>>>>>>>>> }
>>>>>>>>>>>> 
>>>>>>>>>>>> +static void
>>>>>>>>>>>> +parse_commit_fdb_local(struct action_context *ctx,
>>>>>>>>>>>> +                     struct ovnact_commit_fdb_local *fdb_local)
>>>>>>>>>>>> +{
>>>>>>>>>>>> +    uint16_t timeout = 0;
>>>>>>>>>>>> +    lexer_force_match(ctx->lexer, LEX_T_LPAREN); /* Skip '('. */
>>>>>>>>>>>> +    if (!lexer_match_id(ctx->lexer, "timeout")) {
>>>>>>>>>>>> +        lexer_syntax_error(ctx->lexer, "invalid parameter");
>>>>>>>>>>>> +        return;
>>>>>>>>>>>> +    }
>>>>>>>>>>>> +    if (!lexer_force_match(ctx->lexer, LEX_T_EQUALS)) {
>>>>>>>>>>>> +        lexer_syntax_error(ctx->lexer, "invalid parameter");
>>>>>>>>>>>> +        return;
>>>>>>>>>>>> +    }
>>>>>>>>>>>> +    if (!action_parse_uint16(ctx, &timeout, "fdb_local flow 
>>>>>>>>>>>> timeout")) {
>>>>>>>>>>>> +        return;
>>>>>>>>>>>> +    }
>>>>>>>>>>>> +    fdb_local->timeout = timeout;
>>>>>>>>>>>> +    lexer_force_match(ctx->lexer, LEX_T_RPAREN); /* Skip ')'. */
>>>>>>>>>>>> +}
>>>>>>>>>>>> +
>>>>>>>>>>>> +static void
>>>>>>>>>>>> +format_COMMIT_FDB_LOCAL(const struct ovnact_commit_fdb_local 
>>>>>>>>>>>> *fdb_local,
>>>>>>>>>>>> +                      struct ds *s)
>>>>>>>>>>>> +{
>>>>>>>>>>>> +    ds_put_format(s, "commit_fdb_local(timeout=%u);", 
>>>>>>>>>>>> fdb_local->timeout);
>>>>>>>>>>>> +}
>>>>>>>>>>>> +
>>>>>>>>>>>> +static void
>>>>>>>>>>>> +ovnact_commit_fdb_local_free(struct ovnact_commit_fdb_local 
>>>>>>>>>>>> *fdb_local OVS_UNUSED)
>>>>>>>>>>>> +{
>>>>>>>>>>>> +}
>>>>>>>>>>>> +
>>>>>>>>>>>> +static void
>>>>>>>>>>>> +commit_fdb_local_learn_action(struct ovnact_commit_fdb_local 
>>>>>>>>>>>> *fdb_local,
>>>>>>>>>>>> +                        struct ofpbuf *ofpacts, uint32_t cookie)
>>>>>>>>>>>> +{
>>>>>>>>>>>> +    struct ofpact_learn *ol = ofpact_put_LEARN(ofpacts);
>>>>>>>>>>>> +    struct match match = MATCH_CATCHALL_INITIALIZER;
>>>>>>>>>>>> +    struct ofpact_learn_spec *ol_spec;
>>>>>>>>>>>> +    unsigned int imm_bytes;
>>>>>>>>>>>> +    uint8_t *src_imm;
>>>>>>>>>>>> +
>>>>>>>>>>>> +    ol->flags = NX_LEARN_F_DELETE_LEARNED;
>>>>>>>>>>>> +    ol->idle_timeout = fdb_local->timeout;
>>>>>>>>>>>> +    ol->hard_timeout = OFP_FLOW_PERMANENT;
>>>>>>>>>>>> +    ol->priority = OFP_DEFAULT_PRIORITY;
>>>>>>>>>>>> +    ol->table_id = OFTABLE_GET_FDB;
>>>>>>>>>>>> +    ol->cookie = htonll(cookie);
>>>>>>>>>>>> +
>>>>>>>>>>>> +    /* Match on metadata of the packet that created the new 
>>>>>>>>>>>> table. */
>>>>>>>>>>>> +    ol_spec = ofpbuf_put_zeros(ofpacts, sizeof *ol_spec);
>>>>>>>>>>>> +    ol_spec->dst.field = mf_from_id(MFF_METADATA);
>>>>>>>>>>>> +    ol_spec->dst.ofs = 0;
>>>>>>>>>>>> +    ol_spec->dst.n_bits = ol_spec->dst.field->n_bits;
>>>>>>>>>>>> +    ol_spec->n_bits = ol_spec->dst.n_bits;
>>>>>>>>>>>> +    ol_spec->dst_type = NX_LEARN_DST_MATCH;
>>>>>>>>>>>> +    ol_spec->src_type = NX_LEARN_SRC_FIELD;
>>>>>>>>>>>> +    ol_spec->src.field = mf_from_id(MFF_METADATA);
>>>>>>>>>>>> +
>>>>>>>>>>>> +    /* Match on metadata of the packet. */
>>>>>>>>>>>> +    ol_spec = ofpbuf_put_zeros(ofpacts, sizeof *ol_spec);
>>>>>>>>>>>> +    ol_spec->dst.field = mf_from_id(MFF_ETH_DST);
>>>>>>>>>>>> +    ol_spec->dst.ofs = 0;
>>>>>>>>>>>> +    ol_spec->dst.n_bits = ol_spec->dst.field->n_bits;
>>>>>>>>>>>> +    ol_spec->n_bits = ol_spec->dst.n_bits;
>>>>>>>>>>>> +    ol_spec->dst_type = NX_LEARN_DST_MATCH;
>>>>>>>>>>>> +    ol_spec->src_type = NX_LEARN_SRC_FIELD;
>>>>>>>>>>>> +    ol_spec->src.field = mf_from_id(MFF_ETH_SRC);
>>>>>>>>>>>> +
>>>>>>>>>>>> +
>>>>>>>>>>>> +    /* Load MFF_LOG_OUTPORT from MFF_IN_PORT. */
>>>>>>>>>>>> +    ol_spec = ofpbuf_put_zeros(ofpacts, sizeof *ol_spec);
>>>>>>>>>>>> +    ol_spec->dst.field = mf_from_id(MFF_LOG_OUTPORT);
>>>>>>>>>>>> +    ol_spec->dst.ofs = 0;
>>>>>>>>>>>> +    ol_spec->dst.n_bits = ol_spec->dst.field->n_bits;
>>>>>>>>>>>> +    ol_spec->n_bits = ol_spec->dst.n_bits;
>>>>>>>>>>>> +    ol_spec->dst_type = NX_LEARN_DST_LOAD;
>>>>>>>>>>>> +    ol_spec->src_type = NX_LEARN_SRC_FIELD;
>>>>>>>>>>>> +    ol_spec->src.field = mf_from_id(MFF_LOG_INPORT);
>>>>>>>>>>>> +
>>>>>>>>>>>> +    ofpact_finish_LEARN(ofpacts, &ol);
>>>>>>>>>>>> +}
>>>>>>>>>>> 
>>>>>>>>>>> A difference from today's SB.FDB centralized approach is that when
>>>>>>>>>>> ovn-controller restarts these flows will be cleared, I think.
>>>>>>>>>>> 
>>>>>>>>>>> Are we OK with that?  I think so but if not what are the options to
>>>>>>>>>>> avoid clearing the local fdb cache on restart?
>>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>>> OVS has to relearn the FDB  flows whenever tables are cleared.
>>>>>>>>>> During this time, packets gets flooded. I need to think about 
>>>>>>>>>> possible 
>>>>>>>>>> options if we want to retain FDB table.
>>>>>>>>>> 
>>>>>>>>>> Can we take this up as an enhancement in a separate patch
>>>>>>>>>> once we identify a solution for this ?
>>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> Sounds good to me.
>>>>>>>>> 
>>>>>>>>>>> Another difference with today's approach is that this avoids a
>>>>>>>>>>> controller action, that's great!
>>>>>>>>>>> 
>>>>>>>>>>>> +
>>>>>>>>>>>> +static void
>>>>>>>>>>>> +encode_COMMIT_FDB_LOCAL(const struct ovnact_commit_fdb_local 
>>>>>>>>>>>> *fdb_local,
>>>>>>>>>>>> +                      const struct ovnact_encode_params *ep,
>>>>>>>>>>>> +                      struct ofpbuf *ofpacts)
>>>>>>>>>>>> +{
>>>>>>>>>>>> +     commit_fdb_local_learn_action(fdb_local, ofpacts, 
>>>>>>>>>>>> ep->lflow_uuid.parts[0]);
>>>>>>>>>>>> +}
>>>>>>>>>>>> +
>>>>>>>>>>>> static void
>>>>>>>>>>>> encode_MAC_CACHE_USE(const struct ovnact_null *null OVS_UNUSED,
>>>>>>>>>>>>                  const struct ovnact_encode_params *ep,
>>>>>>>>>>>> @@ -5451,6 +5543,8 @@ parse_action(struct action_context *ctx)
>>>>>>>>>>>>     parse_sample(ctx);
>>>>>>>>>>>> } else if (lexer_match_id(ctx->lexer, "mac_cache_use")) {
>>>>>>>>>>>>     ovnact_put_MAC_CACHE_USE(ctx->ovnacts);
>>>>>>>>>>>> +    } else if (lexer_match_id(ctx->lexer, "commit_fdb_local")) {
>>>>>>>>>>>> +        parse_commit_fdb_local(ctx, 
>>>>>>>>>>>> ovnact_put_COMMIT_FDB_LOCAL(ctx->ovnacts));
>>>>>>>>>>>> } else {
>>>>>>>>>>>>     lexer_syntax_error(ctx->lexer, "expecting action");
>>>>>>>>>>>> }
>>>>>>>>>>>> diff --git a/northd/northd.c b/northd/northd.c
>>>>>>>>>>>> index d1465ddf7..de18694a0 100644
>>>>>>>>>>>> --- a/northd/northd.c
>>>>>>>>>>>> +++ b/northd/northd.c
>>>>>>>>>>>> @@ -1834,6 +1834,12 @@ localnet_can_learn_mac(const struct 
>>>>>>>>>>>> nbrec_logical_switch_port *nbsp)
>>>>>>>>>>>> return smap_get_bool(&nbsp->options, "localnet_learn_fdb", false);
>>>>>>>>>>>> }
>>>>>>>>>>>> 
>>>>>>>>>>>> +static bool
>>>>>>>>>>>> +ls_is_fdb_local(const struct nbrec_logical_switch *nbs)
>>>>>>>>>>>> +{
>>>>>>>>>>>> +    return smap_get_bool(&nbs->other_config, "fdb_local", false);
>>>>>>>>>>>> +}
>>>>>>>>>>>> +
>>>>>>>>>>> 
>>>>>>>>>>> Personally, I'd prefer if we don't add another config knob and we 
>>>>>>>>>>> just
>>>>>>>>>>> make this the only way FDB works.  We could also document that the 
>>>>>>>>>>> FDB
>>>>>>>>>>> SB table should be deprecated.
>>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>>> If we want to make local FDB as default then, I think we need to 
>>>>>>>>>> handle overlay 
>>>>>>>>>> use case as well. Probably, we might have to add a new stage in 
>>>>>>>>>> logical switch
>>>>>>>>>> egress pipeline to learn FDB entries for packets coming over tunnel 
>>>>>>>>>> (or)
>>>>>>>>>> something similar.
>>>>>>>>>> 
>>>>>>>>>> Can we take this up in a separate patch ?
>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> Given that we have a fix already for the original problem you were
>>>>>>>>> trying to address I would prefer that we avoid adding new config knobs
>>>>>>>>> and handle both the localnet and overlay cases at the same time.
>>>>>>>>> 
>>>>>>>> 
>>>>>>>> Sure.
>>>>>>>> 
>>>>>>>>> OTOH, why is there a difference?
>>>>>>>>> 
>>>>>>>>> The learned flow loads MFF_LOG_INPORT (from the packet that triggers 
>>>>>>>>> the
>>>>>>>>> learn() action) into MFF_LOG_INPORT:
>>>>>>>>> 
>>>>>>>>> With Geneve and STT overlay MFF_LOG_INPORT is still correctly set.  
>>>>>>>>> With
>>>>>>>>> VXLAN that's not the case but that's already a documented limitation,
>>>>>>>>> LOG_INPORT is not available after VXLAN tunneling therefore features
>>>>>>>>> that need it (like egress ACLs matching against ingress port
>>>>>>>>> identifiers) are not supported:
>>>>>>>>> 
>>>>>>>>> https://urldefense.proofpoint.com/v2/url?u=https-3A__github.com_ovn-2Dorg_ovn_blob_main_ovn-2Darchitecture.7.xml-23L2842&d=DwIDaQ&c=s883GpUCOChKOHiocYtGcg&r=2PQjSDR7A28z1kXE1ptSm6X36oL_nCq1XxeEt7FkLmA&m=LO8-TXDG00EgnhB_oBbKGhuFWEv1FgwWbp-oMgCaMjsrN1ow1XIgiNdpeQ2FpNyQ&s=AyiJd4NaZd3P4HGvFbMO9No0cO8bux35PJNhtr8ZGe4&e=
>>>>>>>>>  
>>>>>>>>> 
>>>>>>>> 
>>>>>>>> In case of overlay, ingress and egress pipelines are processed on 
>>>>>>>> different nodes
>>>>>>>> assuming source and dest VMs are on different nodes. I think this 
>>>>>>>> makes local FDB
>>>>>>>> to learn only local VMs MACs and it never learn remote VMs MACs.
>>>>>>>> 
>>>>>>>> Example: 
>>>>>>>> 2 VMs with unknown addr set are connected to same logical switch where 
>>>>>>>> vm1 is on hv1
>>>>>>>> and vm2 is on hv2. When vm1 is sending ICMP req packet to vm2, ingress 
>>>>>>>> pipeline happens
>>>>>>>> on hv1 and it learns vm1-mac and  this packet gets flooded since 
>>>>>>>> vm2-mac is not yet learned.
>>>>>>>> Packet reaches hv2 over tunnel and egress pipeline is exercised on 
>>>>>>>> hv2. Now, when vm2 is
>>>>>>>> responding back, ingress pipeline happens on hv2 and it learns vm2-mac 
>>>>>>>> and this packet also
>>>>>>>> gets flooded since vm1-mac is not learnt on hv2 when ICMP req packet 
>>>>>>>> is received.
>>>>>>>> 
>>>>>>>> 
>>>>>>> 
>>>>>>> Ah, I see your point, you're right.  So it does look like we'd need a
>>>>>>> dedicated fdb learning stage in the egress pipeline.  But that is
>>>>>>> probably not that terrible.
>>>>>>> 
>>> 
>>> 
>>> Hi Dumitru,
>>> 
> 
> Hi Naveen,
> 
>>> I started working on this patch, sorry for the delay.
>>> 
> 
> Thanks for following up!
> 
>>> On further investigation, I realised that we cannot permanently remove FDB 
>>> table from southbound even
>>> if we add new FDB leaning stage in egress pipeline.
>>> 
>>> Following is the example:
>>> 
>>> Two VMs part of different overlay subnets have unknown addr set whereVM1 is 
>>> connected to 
>>> subnet-1 (LS1) and is running on hv1  and VM2 isconnected to subnet-2 (LS2) 
>>> and is running
>>> on hv2. Both subnets are connected to same logical router(LR).
>>> 
>>> When VM1 is sending ICMP packet to VM2, pipeline stages gets processed as 
>>> below.
>>>  - ICMP Req on HV1: LS1-in -> LR-in -> LR-out -> LS2-in. HV1 learns VM1 
>>> MAC, packet gets
>>>    flooded in LS2.
>>>  - ICMP Req on HV2: LS2-out.
>>>  - ICMP Resp on HV2: LS2-in -> LR-in -> LR-out -> LS1-in.  HV2 learns VM2 
>>> MAC, packet gets
>>>    flooded in LS1.
>>>  - ICMP Resp on HV1: LS1-out.
>>>  - Every packet gets flooded since HV1 never learns VM2 MAC and HV2 never 
>>> learns VM1 MAC.
>>> 
>>> 
>>> I am thinking either we can make FDB local implementation only applicable 
>>> for VLAN subnets with default disabled
>>> (or) implement is for both VLAN and overlay subnets with default disabled 
>>> by documenting the limitation and
>>> explore the possibility of ignoring this setting in northd when multiple 
>>> overlay subnets connected to same LR
>>> have ports with unknown addr configured.
>>> 
>>> Please let me know your thoughts.
> 
> My first impression is that it might become confusing to users if the
> FDB local implementation is only applicablel for VLAN subnets.  I'd
> prefer to avoid having two different implementations (centralized - in
> SB - vs decentralized - local).
> 
>>> 
>>> Thanks,
>>> Naveen
>>> 
>>>>>>>>> One more thing we need to take care of in order to be able to make 
>>>>>>>>> local
>>>>>>>>> FDB the default is "FDB refresh":
>>>>>>>>> 
>>>>>>>>> 551527a5e68e ("controller: Update FDB timestamp")
>>>>>>>>> 
>>>>>>>>> https://urldefense.proofpoint.com/v2/url?u=https-3A__github.com_ovn-2Dorg_ovn_commit_551527a5e68e7233ad80d212d549df98f13e37bc&d=DwIDaQ&c=s883GpUCOChKOHiocYtGcg&r=2PQjSDR7A28z1kXE1ptSm6X36oL_nCq1XxeEt7FkLmA&m=LO8-TXDG00EgnhB_oBbKGhuFWEv1FgwWbp-oMgCaMjsrN1ow1XIgiNdpeQ2FpNyQ&s=dNv1oya7S00oKbz1E9i5IyhqDOsEajuutcKSV6bIAfs&e=
>>>>>>>>>  
>>>>>>>>> 
>>>>>>>> 
>>>>>>>> Sure.
>>>>>>>> 
>>>>>> 
>>>>>> Hi Naveen,
>>>>>> 
>>>>>> Just touching base on this again, I was curious if you're still planning
>>>>>> to work on making the FDB table local; I still think it's beneficial in
>>>>>> general.
>>>>>> 
>>>>>> Thanks,
>>>>>> Dumitru
>>>>> 
>>>>> 
>>>>> Hi Dumitru,
>>>>> 
>>>>> Yes, I will work on this. I will try to send out the patch by Jan 1st 
>>>>> week.
>>>>> 
>>>> 
>>>> Awesome, thanks Naveen!  No rush though, I just wanted to confirm
>>>> whether you were interested in pursuing this.
>>>> 
>>>> Best regards,
>>>> Dumitru
>>>> 
>>>> 
>>> 
>>> 
>>> Hi Naveen,
>>> 
>>> I did run into a similar issue with MAC binding, with distributed routers 
>>> there is currently no other way to share the MAC binding. That made me 
>>> wonder if it's worth pursuing, as the complexity of the code would 
>>> significantly increase having half in-memory, half DB solution.
>>> With the aging option the DB size shouldn't be that problematic anymore not 
>>> sure if there are any other concerns with the current implementation. 
>>> 
>>> From my point of view this might be a dead end, however others might have 
>>> slightly different opinions.
>>> 
>>> Thanks,
>>> Ales
>>> 
>>> -- 
>>> Ales Musil
>>> Senior Software Engineer - OVN Core
>>> Red Hat EMEA [redhat.com]
>>> amu...@redhat.com 
>>>  [red.ht]
>> 
>> 
>> Hi Ales,
>> 
>> Thanks for the reply.
>> 
>> I definitely agree with you on the additional code complexity that we have 
>> to deal with when we have both implementations.
>> I think this FDB local helps to reduce the overhead on SB in case of VLAN 
>> networks depending on the size of the network. 
>> 
>> Also, one more issue we have noticed with FDB  table when fdb learning is 
>> enabled on localnet port:
>>   -  If underlay TOR switch floods some packets then, DB conflicts (OVSDB 
>> reports constraint violation errors) are occuring
>>      when controllers are trying to insert entries to FDB table. This is 
>> happening because when underlay TOR switch floods
>>      a packet then, all controllers receives packets at the same time and 
>> try to install FDB entry at the same time. Only first one
>>      succeeds and others report error. This cause controllers to recompute 
>> all the flows from SB. On scale networks, this is making
>>      OVN controllers busy.
>> 
> 
> Can't we address this in a similar way to what we did for MAC_Bindings
> and add some sort of randomness to when the SB write happens, e.g.:
> 
> b416f6f65d71 ("controller: Add delay after multicast ARP packet")
> 9411dd300d5a ("mac-learn: Make the mac_binding struct more flexible")
> 
> https://urldefense.proofpoint.com/v2/url?u=https-3A__github.com_ovn-2Dorg_ovn_commit_b416f6f65d71&d=DwIDaQ&c=s883GpUCOChKOHiocYtGcg&r=2PQjSDR7A28z1kXE1ptSm6X36oL_nCq1XxeEt7FkLmA&m=c0ceNH31esimVUy0sy65PbuIkZ0t3-hk9pq8tlyPKdZClTRxKXF0G2yiKgxtVzpU&s=sEl-2YZAwckbT5tjW6O64qWcD2iaCGy1fdcg53CQs44&e=
>  
> https://urldefense.proofpoint.com/v2/url?u=https-3A__github.com_ovn-2Dorg_ovn_commit_9411dd300d5a&d=DwIDaQ&c=s883GpUCOChKOHiocYtGcg&r=2PQjSDR7A28z1kXE1ptSm6X36oL_nCq1XxeEt7FkLmA&m=c0ceNH31esimVUy0sy65PbuIkZ0t3-hk9pq8tlyPKdZClTRxKXF0G2yiKgxtVzpU&s=F-0lA-Wm0hPorf8JdI8chPm_nYff8IXatt4F6hE5qpA&e=
>  

Hi Dumitru,

Thanks for pointing me to these, I will go through this and get back.

> 
>> If we are ok to support FDB local implementation only for VLAN subnets then, 
>> I can look at possible options to reduce the code complexity.
> 
> Like I mentioned above, if we can address existing issues in the
> centralized (in SB) implementation, I'd prefer avoiding having two
> different implementations.

ACK

> 
>> Any other suggestions are welcome.
>> 
>> 
>> Thanks,
>> Naveen
> 
> Regards,
> Dumitru

_______________________________________________
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Reply via email to