> On 18-Dec-2023, at 7:26 PM, Dumitru Ceara <dce...@redhat.com> wrote:
> 
> On 11/30/23 16:32, Dumitru Ceara wrote:
>> On 11/30/23 15:54, Naveen Yerramneni wrote:
>>> 
>>> 
>>>> On 30-Nov-2023, at 6:06 PM, Dumitru Ceara <dce...@redhat.com> wrote:
>>>> 
>>>> On 11/30/23 09:45, Naveen Yerramneni wrote:
>>>>> 
>>>>> 
>>>>>> On 29-Nov-2023, at 2:24 PM, Dumitru Ceara <dce...@redhat.com> wrote:
>>>>>> 
>>>>>> On 11/29/23 07:45, naveen.yerramneni wrote:
>>>>>>> This functionality can be enabled at the logical switch level:
>>>>>>> - "other_config:fdb_local" can be used to enable/disable this
>>>>>>>  functionality, it is disabled by default.
>>>>>>> - "other_config:fdb_local_idle_timeout" sepcifies idle timeout
>>>>>>>  for locally learned fdb flows, default timeout is 300 secs.
>>>>>>> 
>>>>>>> If enabled, below lflow is added for each port that has unknown addr 
>>>>>>> set.
>>>>>>> - table=2 (ls_in_lookup_fdb), priority=100, match=(inport == <in_port>),
>>>>>>>  action=(commit_fdb_local(timeout=<timeout>); next;
>>>>>>> 
>>>>>>> New OVN action: "commit_fdb_local". This sets following OVS action.
>>>>>>> - 
>>>>>>> learn(table=71,idle_timeout=<timeout>,delete_learned,OXM_OF_METADATA[],
>>>>>>>        
>>>>>>> NXM_OF_ETH_DST[]=NXM_OF_ETH_SRC[],load:NXM_NX_REG14[]->NXM_NX_REG15[])
>>>>>>> 
>>>>>>> This is useful when OVN is managing VLAN network that has multiple ports
>>>>>>> set with unknown addr and localnet_learn_fdb is enabled. With this 
>>>>>>> config,
>>>>>>> if there is east-west traffic flowing between VMs part of same VLAN
>>>>>>> deployed on different hypervisors then, MAC addrs of the source and
>>>>>>> destination VMs keeps flapping between VM port and localnet port in 
>>>>>>> Southbound FDB table. Enabling fdb_local config makes fdb table local to
>>>>>>> the chassis and avoids MAC flapping.
>>>>>>> 
>>>>>>> Signed-off-by: Naveen Yerramneni <naveen.yerramn...@nutanix.com>
>>>>>>> ---
>>>>>> 
>>>>>> Hi Naveen,
>>>>>> 
>>>>>> Thanks a lot for the patch!
>>>>>> 
>>>>>> Just a note, we already have a fix for the east-west traffic that causes
>>>>>> FDB flapping when localnet is used:
>>>>>> 
>>>>>> https://urldefense.proofpoint.com/v2/url?u=https-3A__github.com_ovn-2Dorg_ovn_commit_2acf91e9628e9481c48e4a6cec8ad5159fdd6d2e&d=DwICaQ&c=s883GpUCOChKOHiocYtGcg&r=2PQjSDR7A28z1kXE1ptSm6X36oL_nCq1XxeEt7FkLmA&m=kPuq992rikXYk63APGxlIpfqY3lPpreN9f4ha9pZKpodnVgE9KfjEUNozpPUFzUu&s=LP9_zs2Rj34vMx20ntbu-A3taXqKMJNVH2TLQyOXCh0&e=
>>>>>>  
>>>>>> 
>>>>>> https://urldefense.proofpoint.com/v2/url?u=https-3A__github.com_ovn-2Dorg_ovn_commit_f3a14907fe2b1ecdcfddfbed595cd097b6efbe14&d=DwICaQ&c=s883GpUCOChKOHiocYtGcg&r=2PQjSDR7A28z1kXE1ptSm6X36oL_nCq1XxeEt7FkLmA&m=kPuq992rikXYk63APGxlIpfqY3lPpreN9f4ha9pZKpodnVgE9KfjEUNozpPUFzUu&s=gsUGtjyf9gSOr1LkcCH0O6MB1_tjXi9fuTgwEFgbRx8&e=
>>>>>>  
>>>>>> 
>>>>>> In general, however, I think it's a very good idea to move the FDB away
>>>>>> from the Southbound and make it local to each hypervisor.  That reduces
>>>>>> load on the Southbound among other things.
>>>>>> 
>>>>> 
>>>>> Hi Dumitru,
>>>>> 
>>>>> Thanks for informing about the patches.
>>>>> Yes, local FDB reduces load on southbound.
>>>>> 
>>>>> 
>>>>>>> include/ovn/actions.h |   7 +++
>>>>>>> lib/actions.c         |  94 ++++++++++++++++++++++++++++++++++++
>>>>>>> northd/northd.c       |  26 ++++++++++
>>>>>>> ovn-nb.xml            |  14 ++++++
>>>>>>> tests/ovn.at          | 108 ++++++++++++++++++++++++++++++++++++++++++
>>>>>>> utilities/ovn-trace.c |   2 +
>>>>>>> 6 files changed, 251 insertions(+)
>>>>>>> 
>>>>>>> diff --git a/include/ovn/actions.h b/include/ovn/actions.h
>>>>>>> index 49cfe0624..85ac92cd3 100644
>>>>>>> --- a/include/ovn/actions.h
>>>>>>> +++ b/include/ovn/actions.h
>>>>>>> @@ -127,6 +127,7 @@ struct collector_set_ids;
>>>>>>>   OVNACT(CHK_LB_AFF,        ovnact_result)          \
>>>>>>>   OVNACT(SAMPLE,            ovnact_sample)          \
>>>>>>>   OVNACT(MAC_CACHE_USE,     ovnact_null)            \
>>>>>>> +    OVNACT(COMMIT_FDB_LOCAL,  ovnact_commit_fdb_local) \
>>>>>>> 
>>>>>>> /* enum ovnact_type, with a member OVNACT_<ENUM> for each action. */
>>>>>>> enum OVS_PACKED_ENUM ovnact_type {
>>>>>>> @@ -514,6 +515,12 @@ struct ovnact_commit_lb_aff {
>>>>>>>   uint16_t timeout;
>>>>>>> };
>>>>>>> 
>>>>>>> +/* OVNACT_COMMIT_FBD_LOCAL. */
>>>>>>> +struct ovnact_commit_fdb_local{
>>>>>>> +    struct ovnact ovnact;
>>>>>>> +    uint16_t timeout;  /* fdb_local flow timeout */
>>>>>>> +};
>>>>>>> +
>>>>>>> /* Internal use by the helpers below. */
>>>>>>> void ovnact_init(struct ovnact *, enum ovnact_type, size_t len);
>>>>>>> void *ovnact_put(struct ofpbuf *, enum ovnact_type, size_t len);
>>>>>>> diff --git a/lib/actions.c b/lib/actions.c
>>>>>>> index a73fe1a1e..f5aa78db1 100644
>>>>>>> --- a/lib/actions.c
>>>>>>> +++ b/lib/actions.c
>>>>>>> @@ -5236,6 +5236,98 @@ format_MAC_CACHE_USE(const struct ovnact_null 
>>>>>>> *null OVS_UNUSED, struct ds *s)
>>>>>>>   ds_put_cstr(s, "mac_cache_use;");
>>>>>>> }
>>>>>>> 
>>>>>>> +static void
>>>>>>> +parse_commit_fdb_local(struct action_context *ctx,
>>>>>>> +                     struct ovnact_commit_fdb_local *fdb_local)
>>>>>>> +{
>>>>>>> +    uint16_t timeout = 0;
>>>>>>> +    lexer_force_match(ctx->lexer, LEX_T_LPAREN); /* Skip '('. */
>>>>>>> +    if (!lexer_match_id(ctx->lexer, "timeout")) {
>>>>>>> +        lexer_syntax_error(ctx->lexer, "invalid parameter");
>>>>>>> +        return;
>>>>>>> +    }
>>>>>>> +    if (!lexer_force_match(ctx->lexer, LEX_T_EQUALS)) {
>>>>>>> +        lexer_syntax_error(ctx->lexer, "invalid parameter");
>>>>>>> +        return;
>>>>>>> +    }
>>>>>>> +    if (!action_parse_uint16(ctx, &timeout, "fdb_local flow timeout")) 
>>>>>>> {
>>>>>>> +        return;
>>>>>>> +    }
>>>>>>> +    fdb_local->timeout = timeout;
>>>>>>> +    lexer_force_match(ctx->lexer, LEX_T_RPAREN); /* Skip ')'. */
>>>>>>> +}
>>>>>>> +
>>>>>>> +static void
>>>>>>> +format_COMMIT_FDB_LOCAL(const struct ovnact_commit_fdb_local 
>>>>>>> *fdb_local,
>>>>>>> +                      struct ds *s)
>>>>>>> +{
>>>>>>> +    ds_put_format(s, "commit_fdb_local(timeout=%u);", 
>>>>>>> fdb_local->timeout);
>>>>>>> +}
>>>>>>> +
>>>>>>> +static void
>>>>>>> +ovnact_commit_fdb_local_free(struct ovnact_commit_fdb_local *fdb_local 
>>>>>>> OVS_UNUSED)
>>>>>>> +{
>>>>>>> +}
>>>>>>> +
>>>>>>> +static void
>>>>>>> +commit_fdb_local_learn_action(struct ovnact_commit_fdb_local 
>>>>>>> *fdb_local,
>>>>>>> +                        struct ofpbuf *ofpacts, uint32_t cookie)
>>>>>>> +{
>>>>>>> +    struct ofpact_learn *ol = ofpact_put_LEARN(ofpacts);
>>>>>>> +    struct match match = MATCH_CATCHALL_INITIALIZER;
>>>>>>> +    struct ofpact_learn_spec *ol_spec;
>>>>>>> +    unsigned int imm_bytes;
>>>>>>> +    uint8_t *src_imm;
>>>>>>> +
>>>>>>> +    ol->flags = NX_LEARN_F_DELETE_LEARNED;
>>>>>>> +    ol->idle_timeout = fdb_local->timeout;
>>>>>>> +    ol->hard_timeout = OFP_FLOW_PERMANENT;
>>>>>>> +    ol->priority = OFP_DEFAULT_PRIORITY;
>>>>>>> +    ol->table_id = OFTABLE_GET_FDB;
>>>>>>> +    ol->cookie = htonll(cookie);
>>>>>>> +
>>>>>>> +    /* Match on metadata of the packet that created the new table. */
>>>>>>> +    ol_spec = ofpbuf_put_zeros(ofpacts, sizeof *ol_spec);
>>>>>>> +    ol_spec->dst.field = mf_from_id(MFF_METADATA);
>>>>>>> +    ol_spec->dst.ofs = 0;
>>>>>>> +    ol_spec->dst.n_bits = ol_spec->dst.field->n_bits;
>>>>>>> +    ol_spec->n_bits = ol_spec->dst.n_bits;
>>>>>>> +    ol_spec->dst_type = NX_LEARN_DST_MATCH;
>>>>>>> +    ol_spec->src_type = NX_LEARN_SRC_FIELD;
>>>>>>> +    ol_spec->src.field = mf_from_id(MFF_METADATA);
>>>>>>> +
>>>>>>> +    /* Match on metadata of the packet. */
>>>>>>> +    ol_spec = ofpbuf_put_zeros(ofpacts, sizeof *ol_spec);
>>>>>>> +    ol_spec->dst.field = mf_from_id(MFF_ETH_DST);
>>>>>>> +    ol_spec->dst.ofs = 0;
>>>>>>> +    ol_spec->dst.n_bits = ol_spec->dst.field->n_bits;
>>>>>>> +    ol_spec->n_bits = ol_spec->dst.n_bits;
>>>>>>> +    ol_spec->dst_type = NX_LEARN_DST_MATCH;
>>>>>>> +    ol_spec->src_type = NX_LEARN_SRC_FIELD;
>>>>>>> +    ol_spec->src.field = mf_from_id(MFF_ETH_SRC);
>>>>>>> +
>>>>>>> +
>>>>>>> +    /* Load MFF_LOG_OUTPORT from MFF_IN_PORT. */
>>>>>>> +    ol_spec = ofpbuf_put_zeros(ofpacts, sizeof *ol_spec);
>>>>>>> +    ol_spec->dst.field = mf_from_id(MFF_LOG_OUTPORT);
>>>>>>> +    ol_spec->dst.ofs = 0;
>>>>>>> +    ol_spec->dst.n_bits = ol_spec->dst.field->n_bits;
>>>>>>> +    ol_spec->n_bits = ol_spec->dst.n_bits;
>>>>>>> +    ol_spec->dst_type = NX_LEARN_DST_LOAD;
>>>>>>> +    ol_spec->src_type = NX_LEARN_SRC_FIELD;
>>>>>>> +    ol_spec->src.field = mf_from_id(MFF_LOG_INPORT);
>>>>>>> +
>>>>>>> +    ofpact_finish_LEARN(ofpacts, &ol);
>>>>>>> +}
>>>>>> 
>>>>>> A difference from today's SB.FDB centralized approach is that when
>>>>>> ovn-controller restarts these flows will be cleared, I think.
>>>>>> 
>>>>>> Are we OK with that?  I think so but if not what are the options to
>>>>>> avoid clearing the local fdb cache on restart?
>>>>>> 
>>>>> 
>>>>> OVS has to relearn the FDB  flows whenever tables are cleared.
>>>>> During this time, packets gets flooded. I need to think about possible 
>>>>> options if we want to retain FDB table.
>>>>> 
>>>>> Can we take this up as an enhancement in a separate patch
>>>>> once we identify a solution for this ?
>>>>> 
>>>> 
>>>> Sounds good to me.
>>>> 
>>>>>> Another difference with today's approach is that this avoids a
>>>>>> controller action, that's great!
>>>>>> 
>>>>>>> +
>>>>>>> +static void
>>>>>>> +encode_COMMIT_FDB_LOCAL(const struct ovnact_commit_fdb_local 
>>>>>>> *fdb_local,
>>>>>>> +                      const struct ovnact_encode_params *ep,
>>>>>>> +                      struct ofpbuf *ofpacts)
>>>>>>> +{
>>>>>>> +     commit_fdb_local_learn_action(fdb_local, ofpacts, 
>>>>>>> ep->lflow_uuid.parts[0]);
>>>>>>> +}
>>>>>>> +
>>>>>>> static void
>>>>>>> encode_MAC_CACHE_USE(const struct ovnact_null *null OVS_UNUSED,
>>>>>>>                    const struct ovnact_encode_params *ep,
>>>>>>> @@ -5451,6 +5543,8 @@ parse_action(struct action_context *ctx)
>>>>>>>       parse_sample(ctx);
>>>>>>>   } else if (lexer_match_id(ctx->lexer, "mac_cache_use")) {
>>>>>>>       ovnact_put_MAC_CACHE_USE(ctx->ovnacts);
>>>>>>> +    } else if (lexer_match_id(ctx->lexer, "commit_fdb_local")) {
>>>>>>> +        parse_commit_fdb_local(ctx, 
>>>>>>> ovnact_put_COMMIT_FDB_LOCAL(ctx->ovnacts));
>>>>>>>   } else {
>>>>>>>       lexer_syntax_error(ctx->lexer, "expecting action");
>>>>>>>   }
>>>>>>> diff --git a/northd/northd.c b/northd/northd.c
>>>>>>> index d1465ddf7..de18694a0 100644
>>>>>>> --- a/northd/northd.c
>>>>>>> +++ b/northd/northd.c
>>>>>>> @@ -1834,6 +1834,12 @@ localnet_can_learn_mac(const struct 
>>>>>>> nbrec_logical_switch_port *nbsp)
>>>>>>>   return smap_get_bool(&nbsp->options, "localnet_learn_fdb", false);
>>>>>>> }
>>>>>>> 
>>>>>>> +static bool
>>>>>>> +ls_is_fdb_local(const struct nbrec_logical_switch *nbs)
>>>>>>> +{
>>>>>>> +    return smap_get_bool(&nbs->other_config, "fdb_local", false);
>>>>>>> +}
>>>>>>> +
>>>>>> 
>>>>>> Personally, I'd prefer if we don't add another config knob and we just
>>>>>> make this the only way FDB works.  We could also document that the FDB
>>>>>> SB table should be deprecated.
>>>>>> 
>>>>> 
>>>>> If we want to make local FDB as default then, I think we need to handle 
>>>>> overlay 
>>>>> use case as well. Probably, we might have to add a new stage in logical 
>>>>> switch
>>>>> egress pipeline to learn FDB entries for packets coming over tunnel (or)
>>>>> something similar.
>>>>> 
>>>>> Can we take this up in a separate patch ?
>>>>> 
>>>>> 
>>>> 
>>>> Given that we have a fix already for the original problem you were
>>>> trying to address I would prefer that we avoid adding new config knobs
>>>> and handle both the localnet and overlay cases at the same time.
>>>> 
>>> 
>>> Sure.
>>> 
>>>> OTOH, why is there a difference?
>>>> 
>>>> The learned flow loads MFF_LOG_INPORT (from the packet that triggers the
>>>> learn() action) into MFF_LOG_INPORT:
>>>> 
>>>> With Geneve and STT overlay MFF_LOG_INPORT is still correctly set.  With
>>>> VXLAN that's not the case but that's already a documented limitation,
>>>> LOG_INPORT is not available after VXLAN tunneling therefore features
>>>> that need it (like egress ACLs matching against ingress port
>>>> identifiers) are not supported:
>>>> 
>>>> https://urldefense.proofpoint.com/v2/url?u=https-3A__github.com_ovn-2Dorg_ovn_blob_main_ovn-2Darchitecture.7.xml-23L2842&d=DwIDaQ&c=s883GpUCOChKOHiocYtGcg&r=2PQjSDR7A28z1kXE1ptSm6X36oL_nCq1XxeEt7FkLmA&m=LO8-TXDG00EgnhB_oBbKGhuFWEv1FgwWbp-oMgCaMjsrN1ow1XIgiNdpeQ2FpNyQ&s=AyiJd4NaZd3P4HGvFbMO9No0cO8bux35PJNhtr8ZGe4&e=
>>>>  
>>>> 
>>> 
>>> In case of overlay, ingress and egress pipelines are processed on different 
>>> nodes
>>> assuming source and dest VMs are on different nodes. I think this makes 
>>> local FDB
>>> to learn only local VMs MACs and it never learn remote VMs MACs.
>>> 
>>> Example: 
>>> 2 VMs with unknown addr set are connected to same logical switch where vm1 
>>> is on hv1
>>> and vm2 is on hv2. When vm1 is sending ICMP req packet to vm2, ingress 
>>> pipeline happens
>>> on hv1 and it learns vm1-mac and  this packet gets flooded since vm2-mac is 
>>> not yet learned.
>>> Packet reaches hv2 over tunnel and egress pipeline is exercised on hv2. 
>>> Now, when vm2 is
>>> responding back, ingress pipeline happens on hv2 and it learns vm2-mac and 
>>> this packet also
>>> gets flooded since vm1-mac is not learnt on hv2 when ICMP req packet is 
>>> received.
>>> 
>>> 
>> 
>> Ah, I see your point, you're right.  So it does look like we'd need a
>> dedicated fdb learning stage in the egress pipeline.  But that is
>> probably not that terrible.
>> 
>>>> One more thing we need to take care of in order to be able to make local
>>>> FDB the default is "FDB refresh":
>>>> 
>>>> 551527a5e68e ("controller: Update FDB timestamp")
>>>> 
>>>> https://urldefense.proofpoint.com/v2/url?u=https-3A__github.com_ovn-2Dorg_ovn_commit_551527a5e68e7233ad80d212d549df98f13e37bc&d=DwIDaQ&c=s883GpUCOChKOHiocYtGcg&r=2PQjSDR7A28z1kXE1ptSm6X36oL_nCq1XxeEt7FkLmA&m=LO8-TXDG00EgnhB_oBbKGhuFWEv1FgwWbp-oMgCaMjsrN1ow1XIgiNdpeQ2FpNyQ&s=dNv1oya7S00oKbz1E9i5IyhqDOsEajuutcKSV6bIAfs&e=
>>>>  
>>>> 
>>> 
>>> Sure.
>>> 
> 
> Hi Naveen,
> 
> Just touching base on this again, I was curious if you're still planning
> to work on making the FDB table local; I still think it's beneficial in
> general.
> 
> Thanks,
> Dumitru


Hi Dumitru,

Yes, I will work on this. I will try to send out the patch by Jan 1st week.

Thanks,
Naveen

_______________________________________________
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Reply via email to