> On 18-Dec-2023, at 8:53 PM, Dumitru Ceara <dce...@redhat.com> wrote:
> 
> On 12/18/23 16:17, Naveen Yerramneni wrote:
>> 
>> 
>>> On 18-Dec-2023, at 7:26 PM, Dumitru Ceara <dce...@redhat.com> wrote:
>>> 
>>> On 11/30/23 16:32, Dumitru Ceara wrote:
>>>> On 11/30/23 15:54, Naveen Yerramneni wrote:
>>>>> 
>>>>> 
>>>>>> On 30-Nov-2023, at 6:06 PM, Dumitru Ceara <dce...@redhat.com> wrote:
>>>>>> 
>>>>>> On 11/30/23 09:45, Naveen Yerramneni wrote:
>>>>>>> 
>>>>>>> 
>>>>>>>> On 29-Nov-2023, at 2:24 PM, Dumitru Ceara <dce...@redhat.com> wrote:
>>>>>>>> 
>>>>>>>> On 11/29/23 07:45, naveen.yerramneni wrote:
>>>>>>>>> This functionality can be enabled at the logical switch level:
>>>>>>>>> - "other_config:fdb_local" can be used to enable/disable this
>>>>>>>>> functionality, it is disabled by default.
>>>>>>>>> - "other_config:fdb_local_idle_timeout" sepcifies idle timeout
>>>>>>>>> for locally learned fdb flows, default timeout is 300 secs.
>>>>>>>>> 
>>>>>>>>> If enabled, below lflow is added for each port that has unknown addr 
>>>>>>>>> set.
>>>>>>>>> - table=2 (ls_in_lookup_fdb), priority=100, match=(inport == 
>>>>>>>>> <in_port>),
>>>>>>>>> action=(commit_fdb_local(timeout=<timeout>); next;
>>>>>>>>> 
>>>>>>>>> New OVN action: "commit_fdb_local". This sets following OVS action.
>>>>>>>>> - 
>>>>>>>>> learn(table=71,idle_timeout=<timeout>,delete_learned,OXM_OF_METADATA[],
>>>>>>>>>       
>>>>>>>>> NXM_OF_ETH_DST[]=NXM_OF_ETH_SRC[],load:NXM_NX_REG14[]->NXM_NX_REG15[])
>>>>>>>>> 
>>>>>>>>> This is useful when OVN is managing VLAN network that has multiple 
>>>>>>>>> ports
>>>>>>>>> set with unknown addr and localnet_learn_fdb is enabled. With this 
>>>>>>>>> config,
>>>>>>>>> if there is east-west traffic flowing between VMs part of same VLAN
>>>>>>>>> deployed on different hypervisors then, MAC addrs of the source and
>>>>>>>>> destination VMs keeps flapping between VM port and localnet port in 
>>>>>>>>> Southbound FDB table. Enabling fdb_local config makes fdb table local 
>>>>>>>>> to
>>>>>>>>> the chassis and avoids MAC flapping.
>>>>>>>>> 
>>>>>>>>> Signed-off-by: Naveen Yerramneni <naveen.yerramn...@nutanix.com>
>>>>>>>>> ---
>>>>>>>> 
>>>>>>>> Hi Naveen,
>>>>>>>> 
>>>>>>>> Thanks a lot for the patch!
>>>>>>>> 
>>>>>>>> Just a note, we already have a fix for the east-west traffic that 
>>>>>>>> causes
>>>>>>>> FDB flapping when localnet is used:
>>>>>>>> 
>>>>>>>> https://urldefense.proofpoint.com/v2/url?u=https-3A__github.com_ovn-2Dorg_ovn_commit_2acf91e9628e9481c48e4a6cec8ad5159fdd6d2e&d=DwICaQ&c=s883GpUCOChKOHiocYtGcg&r=2PQjSDR7A28z1kXE1ptSm6X36oL_nCq1XxeEt7FkLmA&m=kPuq992rikXYk63APGxlIpfqY3lPpreN9f4ha9pZKpodnVgE9KfjEUNozpPUFzUu&s=LP9_zs2Rj34vMx20ntbu-A3taXqKMJNVH2TLQyOXCh0&e=
>>>>>>>>  
>>>>>>>> 
>>>>>>>> https://urldefense.proofpoint.com/v2/url?u=https-3A__github.com_ovn-2Dorg_ovn_commit_f3a14907fe2b1ecdcfddfbed595cd097b6efbe14&d=DwICaQ&c=s883GpUCOChKOHiocYtGcg&r=2PQjSDR7A28z1kXE1ptSm6X36oL_nCq1XxeEt7FkLmA&m=kPuq992rikXYk63APGxlIpfqY3lPpreN9f4ha9pZKpodnVgE9KfjEUNozpPUFzUu&s=gsUGtjyf9gSOr1LkcCH0O6MB1_tjXi9fuTgwEFgbRx8&e=
>>>>>>>>  
>>>>>>>> 
>>>>>>>> In general, however, I think it's a very good idea to move the FDB away
>>>>>>>> from the Southbound and make it local to each hypervisor.  That reduces
>>>>>>>> load on the Southbound among other things.
>>>>>>>> 
>>>>>>> 
>>>>>>> Hi Dumitru,
>>>>>>> 
>>>>>>> Thanks for informing about the patches.
>>>>>>> Yes, local FDB reduces load on southbound.
>>>>>>> 
>>>>>>> 
>>>>>>>>> include/ovn/actions.h |   7 +++
>>>>>>>>> lib/actions.c         |  94 ++++++++++++++++++++++++++++++++++++
>>>>>>>>> northd/northd.c       |  26 ++++++++++
>>>>>>>>> ovn-nb.xml            |  14 ++++++
>>>>>>>>> tests/ovn.at          | 108 ++++++++++++++++++++++++++++++++++++++++++
>>>>>>>>> utilities/ovn-trace.c |   2 +
>>>>>>>>> 6 files changed, 251 insertions(+)
>>>>>>>>> 
>>>>>>>>> diff --git a/include/ovn/actions.h b/include/ovn/actions.h
>>>>>>>>> index 49cfe0624..85ac92cd3 100644
>>>>>>>>> --- a/include/ovn/actions.h
>>>>>>>>> +++ b/include/ovn/actions.h
>>>>>>>>> @@ -127,6 +127,7 @@ struct collector_set_ids;
>>>>>>>>>  OVNACT(CHK_LB_AFF,        ovnact_result)          \
>>>>>>>>>  OVNACT(SAMPLE,            ovnact_sample)          \
>>>>>>>>>  OVNACT(MAC_CACHE_USE,     ovnact_null)            \
>>>>>>>>> +    OVNACT(COMMIT_FDB_LOCAL,  ovnact_commit_fdb_local) \
>>>>>>>>> 
>>>>>>>>> /* enum ovnact_type, with a member OVNACT_<ENUM> for each action. */
>>>>>>>>> enum OVS_PACKED_ENUM ovnact_type {
>>>>>>>>> @@ -514,6 +515,12 @@ struct ovnact_commit_lb_aff {
>>>>>>>>>  uint16_t timeout;
>>>>>>>>> };
>>>>>>>>> 
>>>>>>>>> +/* OVNACT_COMMIT_FBD_LOCAL. */
>>>>>>>>> +struct ovnact_commit_fdb_local{
>>>>>>>>> +    struct ovnact ovnact;
>>>>>>>>> +    uint16_t timeout;  /* fdb_local flow timeout */
>>>>>>>>> +};
>>>>>>>>> +
>>>>>>>>> /* Internal use by the helpers below. */
>>>>>>>>> void ovnact_init(struct ovnact *, enum ovnact_type, size_t len);
>>>>>>>>> void *ovnact_put(struct ofpbuf *, enum ovnact_type, size_t len);
>>>>>>>>> diff --git a/lib/actions.c b/lib/actions.c
>>>>>>>>> index a73fe1a1e..f5aa78db1 100644
>>>>>>>>> --- a/lib/actions.c
>>>>>>>>> +++ b/lib/actions.c
>>>>>>>>> @@ -5236,6 +5236,98 @@ format_MAC_CACHE_USE(const struct ovnact_null 
>>>>>>>>> *null OVS_UNUSED, struct ds *s)
>>>>>>>>>  ds_put_cstr(s, "mac_cache_use;");
>>>>>>>>> }
>>>>>>>>> 
>>>>>>>>> +static void
>>>>>>>>> +parse_commit_fdb_local(struct action_context *ctx,
>>>>>>>>> +                     struct ovnact_commit_fdb_local *fdb_local)
>>>>>>>>> +{
>>>>>>>>> +    uint16_t timeout = 0;
>>>>>>>>> +    lexer_force_match(ctx->lexer, LEX_T_LPAREN); /* Skip '('. */
>>>>>>>>> +    if (!lexer_match_id(ctx->lexer, "timeout")) {
>>>>>>>>> +        lexer_syntax_error(ctx->lexer, "invalid parameter");
>>>>>>>>> +        return;
>>>>>>>>> +    }
>>>>>>>>> +    if (!lexer_force_match(ctx->lexer, LEX_T_EQUALS)) {
>>>>>>>>> +        lexer_syntax_error(ctx->lexer, "invalid parameter");
>>>>>>>>> +        return;
>>>>>>>>> +    }
>>>>>>>>> +    if (!action_parse_uint16(ctx, &timeout, "fdb_local flow 
>>>>>>>>> timeout")) {
>>>>>>>>> +        return;
>>>>>>>>> +    }
>>>>>>>>> +    fdb_local->timeout = timeout;
>>>>>>>>> +    lexer_force_match(ctx->lexer, LEX_T_RPAREN); /* Skip ')'. */
>>>>>>>>> +}
>>>>>>>>> +
>>>>>>>>> +static void
>>>>>>>>> +format_COMMIT_FDB_LOCAL(const struct ovnact_commit_fdb_local 
>>>>>>>>> *fdb_local,
>>>>>>>>> +                      struct ds *s)
>>>>>>>>> +{
>>>>>>>>> +    ds_put_format(s, "commit_fdb_local(timeout=%u);", 
>>>>>>>>> fdb_local->timeout);
>>>>>>>>> +}
>>>>>>>>> +
>>>>>>>>> +static void
>>>>>>>>> +ovnact_commit_fdb_local_free(struct ovnact_commit_fdb_local 
>>>>>>>>> *fdb_local OVS_UNUSED)
>>>>>>>>> +{
>>>>>>>>> +}
>>>>>>>>> +
>>>>>>>>> +static void
>>>>>>>>> +commit_fdb_local_learn_action(struct ovnact_commit_fdb_local 
>>>>>>>>> *fdb_local,
>>>>>>>>> +                        struct ofpbuf *ofpacts, uint32_t cookie)
>>>>>>>>> +{
>>>>>>>>> +    struct ofpact_learn *ol = ofpact_put_LEARN(ofpacts);
>>>>>>>>> +    struct match match = MATCH_CATCHALL_INITIALIZER;
>>>>>>>>> +    struct ofpact_learn_spec *ol_spec;
>>>>>>>>> +    unsigned int imm_bytes;
>>>>>>>>> +    uint8_t *src_imm;
>>>>>>>>> +
>>>>>>>>> +    ol->flags = NX_LEARN_F_DELETE_LEARNED;
>>>>>>>>> +    ol->idle_timeout = fdb_local->timeout;
>>>>>>>>> +    ol->hard_timeout = OFP_FLOW_PERMANENT;
>>>>>>>>> +    ol->priority = OFP_DEFAULT_PRIORITY;
>>>>>>>>> +    ol->table_id = OFTABLE_GET_FDB;
>>>>>>>>> +    ol->cookie = htonll(cookie);
>>>>>>>>> +
>>>>>>>>> +    /* Match on metadata of the packet that created the new table. */
>>>>>>>>> +    ol_spec = ofpbuf_put_zeros(ofpacts, sizeof *ol_spec);
>>>>>>>>> +    ol_spec->dst.field = mf_from_id(MFF_METADATA);
>>>>>>>>> +    ol_spec->dst.ofs = 0;
>>>>>>>>> +    ol_spec->dst.n_bits = ol_spec->dst.field->n_bits;
>>>>>>>>> +    ol_spec->n_bits = ol_spec->dst.n_bits;
>>>>>>>>> +    ol_spec->dst_type = NX_LEARN_DST_MATCH;
>>>>>>>>> +    ol_spec->src_type = NX_LEARN_SRC_FIELD;
>>>>>>>>> +    ol_spec->src.field = mf_from_id(MFF_METADATA);
>>>>>>>>> +
>>>>>>>>> +    /* Match on metadata of the packet. */
>>>>>>>>> +    ol_spec = ofpbuf_put_zeros(ofpacts, sizeof *ol_spec);
>>>>>>>>> +    ol_spec->dst.field = mf_from_id(MFF_ETH_DST);
>>>>>>>>> +    ol_spec->dst.ofs = 0;
>>>>>>>>> +    ol_spec->dst.n_bits = ol_spec->dst.field->n_bits;
>>>>>>>>> +    ol_spec->n_bits = ol_spec->dst.n_bits;
>>>>>>>>> +    ol_spec->dst_type = NX_LEARN_DST_MATCH;
>>>>>>>>> +    ol_spec->src_type = NX_LEARN_SRC_FIELD;
>>>>>>>>> +    ol_spec->src.field = mf_from_id(MFF_ETH_SRC);
>>>>>>>>> +
>>>>>>>>> +
>>>>>>>>> +    /* Load MFF_LOG_OUTPORT from MFF_IN_PORT. */
>>>>>>>>> +    ol_spec = ofpbuf_put_zeros(ofpacts, sizeof *ol_spec);
>>>>>>>>> +    ol_spec->dst.field = mf_from_id(MFF_LOG_OUTPORT);
>>>>>>>>> +    ol_spec->dst.ofs = 0;
>>>>>>>>> +    ol_spec->dst.n_bits = ol_spec->dst.field->n_bits;
>>>>>>>>> +    ol_spec->n_bits = ol_spec->dst.n_bits;
>>>>>>>>> +    ol_spec->dst_type = NX_LEARN_DST_LOAD;
>>>>>>>>> +    ol_spec->src_type = NX_LEARN_SRC_FIELD;
>>>>>>>>> +    ol_spec->src.field = mf_from_id(MFF_LOG_INPORT);
>>>>>>>>> +
>>>>>>>>> +    ofpact_finish_LEARN(ofpacts, &ol);
>>>>>>>>> +}
>>>>>>>> 
>>>>>>>> A difference from today's SB.FDB centralized approach is that when
>>>>>>>> ovn-controller restarts these flows will be cleared, I think.
>>>>>>>> 
>>>>>>>> Are we OK with that?  I think so but if not what are the options to
>>>>>>>> avoid clearing the local fdb cache on restart?
>>>>>>>> 
>>>>>>> 
>>>>>>> OVS has to relearn the FDB  flows whenever tables are cleared.
>>>>>>> During this time, packets gets flooded. I need to think about possible 
>>>>>>> options if we want to retain FDB table.
>>>>>>> 
>>>>>>> Can we take this up as an enhancement in a separate patch
>>>>>>> once we identify a solution for this ?
>>>>>>> 
>>>>>> 
>>>>>> Sounds good to me.
>>>>>> 
>>>>>>>> Another difference with today's approach is that this avoids a
>>>>>>>> controller action, that's great!
>>>>>>>> 
>>>>>>>>> +
>>>>>>>>> +static void
>>>>>>>>> +encode_COMMIT_FDB_LOCAL(const struct ovnact_commit_fdb_local 
>>>>>>>>> *fdb_local,
>>>>>>>>> +                      const struct ovnact_encode_params *ep,
>>>>>>>>> +                      struct ofpbuf *ofpacts)
>>>>>>>>> +{
>>>>>>>>> +     commit_fdb_local_learn_action(fdb_local, ofpacts, 
>>>>>>>>> ep->lflow_uuid.parts[0]);
>>>>>>>>> +}
>>>>>>>>> +
>>>>>>>>> static void
>>>>>>>>> encode_MAC_CACHE_USE(const struct ovnact_null *null OVS_UNUSED,
>>>>>>>>>                   const struct ovnact_encode_params *ep,
>>>>>>>>> @@ -5451,6 +5543,8 @@ parse_action(struct action_context *ctx)
>>>>>>>>>      parse_sample(ctx);
>>>>>>>>>  } else if (lexer_match_id(ctx->lexer, "mac_cache_use")) {
>>>>>>>>>      ovnact_put_MAC_CACHE_USE(ctx->ovnacts);
>>>>>>>>> +    } else if (lexer_match_id(ctx->lexer, "commit_fdb_local")) {
>>>>>>>>> +        parse_commit_fdb_local(ctx, 
>>>>>>>>> ovnact_put_COMMIT_FDB_LOCAL(ctx->ovnacts));
>>>>>>>>>  } else {
>>>>>>>>>      lexer_syntax_error(ctx->lexer, "expecting action");
>>>>>>>>>  }
>>>>>>>>> diff --git a/northd/northd.c b/northd/northd.c
>>>>>>>>> index d1465ddf7..de18694a0 100644
>>>>>>>>> --- a/northd/northd.c
>>>>>>>>> +++ b/northd/northd.c
>>>>>>>>> @@ -1834,6 +1834,12 @@ localnet_can_learn_mac(const struct 
>>>>>>>>> nbrec_logical_switch_port *nbsp)
>>>>>>>>>  return smap_get_bool(&nbsp->options, "localnet_learn_fdb", false);
>>>>>>>>> }
>>>>>>>>> 
>>>>>>>>> +static bool
>>>>>>>>> +ls_is_fdb_local(const struct nbrec_logical_switch *nbs)
>>>>>>>>> +{
>>>>>>>>> +    return smap_get_bool(&nbs->other_config, "fdb_local", false);
>>>>>>>>> +}
>>>>>>>>> +
>>>>>>>> 
>>>>>>>> Personally, I'd prefer if we don't add another config knob and we just
>>>>>>>> make this the only way FDB works.  We could also document that the FDB
>>>>>>>> SB table should be deprecated.
>>>>>>>> 
>>>>>>> 
>>>>>>> If we want to make local FDB as default then, I think we need to handle 
>>>>>>> overlay 
>>>>>>> use case as well. Probably, we might have to add a new stage in logical 
>>>>>>> switch
>>>>>>> egress pipeline to learn FDB entries for packets coming over tunnel (or)
>>>>>>> something similar.
>>>>>>> 
>>>>>>> Can we take this up in a separate patch ?
>>>>>>> 
>>>>>>> 
>>>>>> 
>>>>>> Given that we have a fix already for the original problem you were
>>>>>> trying to address I would prefer that we avoid adding new config knobs
>>>>>> and handle both the localnet and overlay cases at the same time.
>>>>>> 
>>>>> 
>>>>> Sure.
>>>>> 
>>>>>> OTOH, why is there a difference?
>>>>>> 
>>>>>> The learned flow loads MFF_LOG_INPORT (from the packet that triggers the
>>>>>> learn() action) into MFF_LOG_INPORT:
>>>>>> 
>>>>>> With Geneve and STT overlay MFF_LOG_INPORT is still correctly set.  With
>>>>>> VXLAN that's not the case but that's already a documented limitation,
>>>>>> LOG_INPORT is not available after VXLAN tunneling therefore features
>>>>>> that need it (like egress ACLs matching against ingress port
>>>>>> identifiers) are not supported:
>>>>>> 
>>>>>> https://urldefense.proofpoint.com/v2/url?u=https-3A__github.com_ovn-2Dorg_ovn_blob_main_ovn-2Darchitecture.7.xml-23L2842&d=DwIDaQ&c=s883GpUCOChKOHiocYtGcg&r=2PQjSDR7A28z1kXE1ptSm6X36oL_nCq1XxeEt7FkLmA&m=LO8-TXDG00EgnhB_oBbKGhuFWEv1FgwWbp-oMgCaMjsrN1ow1XIgiNdpeQ2FpNyQ&s=AyiJd4NaZd3P4HGvFbMO9No0cO8bux35PJNhtr8ZGe4&e=
>>>>>>  
>>>>>> 
>>>>> 
>>>>> In case of overlay, ingress and egress pipelines are processed on 
>>>>> different nodes
>>>>> assuming source and dest VMs are on different nodes. I think this makes 
>>>>> local FDB
>>>>> to learn only local VMs MACs and it never learn remote VMs MACs.
>>>>> 
>>>>> Example: 
>>>>> 2 VMs with unknown addr set are connected to same logical switch where 
>>>>> vm1 is on hv1
>>>>> and vm2 is on hv2. When vm1 is sending ICMP req packet to vm2, ingress 
>>>>> pipeline happens
>>>>> on hv1 and it learns vm1-mac and  this packet gets flooded since vm2-mac 
>>>>> is not yet learned.
>>>>> Packet reaches hv2 over tunnel and egress pipeline is exercised on hv2. 
>>>>> Now, when vm2 is
>>>>> responding back, ingress pipeline happens on hv2 and it learns vm2-mac 
>>>>> and this packet also
>>>>> gets flooded since vm1-mac is not learnt on hv2 when ICMP req packet is 
>>>>> received.
>>>>> 
>>>>> 
>>>> 
>>>> Ah, I see your point, you're right.  So it does look like we'd need a
>>>> dedicated fdb learning stage in the egress pipeline.  But that is
>>>> probably not that terrible.
>>>> 


Hi Dumitru,

I started working on this patch, sorry for the delay.

On further investigation, I realised that we cannot permanently remove FDB 
table from southbound even
if we add new FDB leaning stage in egress pipeline.

Following is the example:

Two VMs part of different overlay subnets have unknown addr set whereVM1 is 
connected to 
subnet-1 (LS1) and is running on hv1  and VM2 isconnected to subnet-2 (LS2) and 
is running
on hv2. Both subnets are connected to same logical router(LR).

When VM1 is sending ICMP packet to VM2, pipeline stages gets processed as below.
  - ICMP Req on HV1: LS1-in -> LR-in -> LR-out -> LS2-in. HV1 learns VM1 MAC, 
packet gets
    flooded in LS2.
  - ICMP Req on HV2: LS2-out.
  - ICMP Resp on HV2: LS2-in -> LR-in -> LR-out -> LS1-in.  HV2 learns VM2 MAC, 
packet gets
    flooded in LS1.
  - ICMP Resp on HV1: LS1-out.
  - Every packet gets flooded since HV1 never learns VM2 MAC and HV2 never 
learns VM1 MAC.


I am thinking either we can make FDB local implementation only applicable for 
VLAN subnets with default disabled
(or) implement is for both VLAN and overlay subnets with default disabled by 
documenting the limitation and
explore the possibility of ignoring this setting in northd when multiple 
overlay subnets connected to same LR
have ports with unknown addr configured.

Please let me know your thoughts.

Thanks,
Naveen

>>>>>> One more thing we need to take care of in order to be able to make local
>>>>>> FDB the default is "FDB refresh":
>>>>>> 
>>>>>> 551527a5e68e ("controller: Update FDB timestamp")
>>>>>> 
>>>>>> https://urldefense.proofpoint.com/v2/url?u=https-3A__github.com_ovn-2Dorg_ovn_commit_551527a5e68e7233ad80d212d549df98f13e37bc&d=DwIDaQ&c=s883GpUCOChKOHiocYtGcg&r=2PQjSDR7A28z1kXE1ptSm6X36oL_nCq1XxeEt7FkLmA&m=LO8-TXDG00EgnhB_oBbKGhuFWEv1FgwWbp-oMgCaMjsrN1ow1XIgiNdpeQ2FpNyQ&s=dNv1oya7S00oKbz1E9i5IyhqDOsEajuutcKSV6bIAfs&e=
>>>>>>  
>>>>>> 
>>>>> 
>>>>> Sure.
>>>>> 
>>> 
>>> Hi Naveen,
>>> 
>>> Just touching base on this again, I was curious if you're still planning
>>> to work on making the FDB table local; I still think it's beneficial in
>>> general.
>>> 
>>> Thanks,
>>> Dumitru
>> 
>> 
>> Hi Dumitru,
>> 
>> Yes, I will work on this. I will try to send out the patch by Jan 1st week.
>> 
> 
> Awesome, thanks Naveen!  No rush though, I just wanted to confirm
> whether you were interested in pursuing this.
> 
> Best regards,
> Dumitru
> 
> 

_______________________________________________
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Reply via email to