> On 18-Dec-2023, at 7:26 PM, Dumitru Ceara <dce...@redhat.com> wrote:
>
> On 11/30/23 16:32, Dumitru Ceara wrote:
>> On 11/30/23 15:54, Naveen Yerramneni wrote:
>>>
>>>
>>>> On 30-Nov-2023, at 6:06 PM, Dumitru Ceara <dce...@redhat.com> wrote:
>>>>
>>>> On 11/30/23 09:45, Naveen Yerramneni wrote:
>>>>>
>>>>>
>>>>>> On 29-Nov-2023, at 2:24 PM, Dumitru Ceara <dce...@redhat.com> wrote:
>>>>>>
>>>>>> On 11/29/23 07:45, naveen.yerramneni wrote:
>>>>>>> This functionality can be enabled at the logical switch level:
>>>>>>> - "other_config:fdb_local" can be used to enable/disable this
>>>>>>> functionality, it is disabled by default.
>>>>>>> - "other_config:fdb_local_idle_timeout" sepcifies idle timeout
>>>>>>> for locally learned fdb flows, default timeout is 300 secs.
>>>>>>>
>>>>>>> If enabled, below lflow is added for each port that has unknown addr
>>>>>>> set.
>>>>>>> - table=2 (ls_in_lookup_fdb), priority=100, match=(inport == <in_port>),
>>>>>>> action=(commit_fdb_local(timeout=<timeout>); next;
>>>>>>>
>>>>>>> New OVN action: "commit_fdb_local". This sets following OVS action.
>>>>>>> -
>>>>>>> learn(table=71,idle_timeout=<timeout>,delete_learned,OXM_OF_METADATA[],
>>>>>>>
>>>>>>> NXM_OF_ETH_DST[]=NXM_OF_ETH_SRC[],load:NXM_NX_REG14[]->NXM_NX_REG15[])
>>>>>>>
>>>>>>> This is useful when OVN is managing VLAN network that has multiple ports
>>>>>>> set with unknown addr and localnet_learn_fdb is enabled. With this
>>>>>>> config,
>>>>>>> if there is east-west traffic flowing between VMs part of same VLAN
>>>>>>> deployed on different hypervisors then, MAC addrs of the source and
>>>>>>> destination VMs keeps flapping between VM port and localnet port in
>>>>>>> Southbound FDB table. Enabling fdb_local config makes fdb table local to
>>>>>>> the chassis and avoids MAC flapping.
>>>>>>>
>>>>>>> Signed-off-by: Naveen Yerramneni <naveen.yerramn...@nutanix.com>
>>>>>>> ---
>>>>>>
>>>>>> Hi Naveen,
>>>>>>
>>>>>> Thanks a lot for the patch!
>>>>>>
>>>>>> Just a note, we already have a fix for the east-west traffic that causes
>>>>>> FDB flapping when localnet is used:
>>>>>>
>>>>>> https://urldefense.proofpoint.com/v2/url?u=https-3A__github.com_ovn-2Dorg_ovn_commit_2acf91e9628e9481c48e4a6cec8ad5159fdd6d2e&d=DwICaQ&c=s883GpUCOChKOHiocYtGcg&r=2PQjSDR7A28z1kXE1ptSm6X36oL_nCq1XxeEt7FkLmA&m=kPuq992rikXYk63APGxlIpfqY3lPpreN9f4ha9pZKpodnVgE9KfjEUNozpPUFzUu&s=LP9_zs2Rj34vMx20ntbu-A3taXqKMJNVH2TLQyOXCh0&e=
>>>>>>
>>>>>>
>>>>>> https://urldefense.proofpoint.com/v2/url?u=https-3A__github.com_ovn-2Dorg_ovn_commit_f3a14907fe2b1ecdcfddfbed595cd097b6efbe14&d=DwICaQ&c=s883GpUCOChKOHiocYtGcg&r=2PQjSDR7A28z1kXE1ptSm6X36oL_nCq1XxeEt7FkLmA&m=kPuq992rikXYk63APGxlIpfqY3lPpreN9f4ha9pZKpodnVgE9KfjEUNozpPUFzUu&s=gsUGtjyf9gSOr1LkcCH0O6MB1_tjXi9fuTgwEFgbRx8&e=
>>>>>>
>>>>>>
>>>>>> In general, however, I think it's a very good idea to move the FDB away
>>>>>> from the Southbound and make it local to each hypervisor. That reduces
>>>>>> load on the Southbound among other things.
>>>>>>
>>>>>
>>>>> Hi Dumitru,
>>>>>
>>>>> Thanks for informing about the patches.
>>>>> Yes, local FDB reduces load on southbound.
>>>>>
>>>>>
>>>>>>> include/ovn/actions.h | 7 +++
>>>>>>> lib/actions.c | 94 ++++++++++++++++++++++++++++++++++++
>>>>>>> northd/northd.c | 26 ++++++++++
>>>>>>> ovn-nb.xml | 14 ++++++
>>>>>>> tests/ovn.at | 108 ++++++++++++++++++++++++++++++++++++++++++
>>>>>>> utilities/ovn-trace.c | 2 +
>>>>>>> 6 files changed, 251 insertions(+)
>>>>>>>
>>>>>>> diff --git a/include/ovn/actions.h b/include/ovn/actions.h
>>>>>>> index 49cfe0624..85ac92cd3 100644
>>>>>>> --- a/include/ovn/actions.h
>>>>>>> +++ b/include/ovn/actions.h
>>>>>>> @@ -127,6 +127,7 @@ struct collector_set_ids;
>>>>>>> OVNACT(CHK_LB_AFF, ovnact_result) \
>>>>>>> OVNACT(SAMPLE, ovnact_sample) \
>>>>>>> OVNACT(MAC_CACHE_USE, ovnact_null) \
>>>>>>> + OVNACT(COMMIT_FDB_LOCAL, ovnact_commit_fdb_local) \
>>>>>>>
>>>>>>> /* enum ovnact_type, with a member OVNACT_<ENUM> for each action. */
>>>>>>> enum OVS_PACKED_ENUM ovnact_type {
>>>>>>> @@ -514,6 +515,12 @@ struct ovnact_commit_lb_aff {
>>>>>>> uint16_t timeout;
>>>>>>> };
>>>>>>>
>>>>>>> +/* OVNACT_COMMIT_FBD_LOCAL. */
>>>>>>> +struct ovnact_commit_fdb_local{
>>>>>>> + struct ovnact ovnact;
>>>>>>> + uint16_t timeout; /* fdb_local flow timeout */
>>>>>>> +};
>>>>>>> +
>>>>>>> /* Internal use by the helpers below. */
>>>>>>> void ovnact_init(struct ovnact *, enum ovnact_type, size_t len);
>>>>>>> void *ovnact_put(struct ofpbuf *, enum ovnact_type, size_t len);
>>>>>>> diff --git a/lib/actions.c b/lib/actions.c
>>>>>>> index a73fe1a1e..f5aa78db1 100644
>>>>>>> --- a/lib/actions.c
>>>>>>> +++ b/lib/actions.c
>>>>>>> @@ -5236,6 +5236,98 @@ format_MAC_CACHE_USE(const struct ovnact_null
>>>>>>> *null OVS_UNUSED, struct ds *s)
>>>>>>> ds_put_cstr(s, "mac_cache_use;");
>>>>>>> }
>>>>>>>
>>>>>>> +static void
>>>>>>> +parse_commit_fdb_local(struct action_context *ctx,
>>>>>>> + struct ovnact_commit_fdb_local *fdb_local)
>>>>>>> +{
>>>>>>> + uint16_t timeout = 0;
>>>>>>> + lexer_force_match(ctx->lexer, LEX_T_LPAREN); /* Skip '('. */
>>>>>>> + if (!lexer_match_id(ctx->lexer, "timeout")) {
>>>>>>> + lexer_syntax_error(ctx->lexer, "invalid parameter");
>>>>>>> + return;
>>>>>>> + }
>>>>>>> + if (!lexer_force_match(ctx->lexer, LEX_T_EQUALS)) {
>>>>>>> + lexer_syntax_error(ctx->lexer, "invalid parameter");
>>>>>>> + return;
>>>>>>> + }
>>>>>>> + if (!action_parse_uint16(ctx, &timeout, "fdb_local flow timeout"))
>>>>>>> {
>>>>>>> + return;
>>>>>>> + }
>>>>>>> + fdb_local->timeout = timeout;
>>>>>>> + lexer_force_match(ctx->lexer, LEX_T_RPAREN); /* Skip ')'. */
>>>>>>> +}
>>>>>>> +
>>>>>>> +static void
>>>>>>> +format_COMMIT_FDB_LOCAL(const struct ovnact_commit_fdb_local
>>>>>>> *fdb_local,
>>>>>>> + struct ds *s)
>>>>>>> +{
>>>>>>> + ds_put_format(s, "commit_fdb_local(timeout=%u);",
>>>>>>> fdb_local->timeout);
>>>>>>> +}
>>>>>>> +
>>>>>>> +static void
>>>>>>> +ovnact_commit_fdb_local_free(struct ovnact_commit_fdb_local *fdb_local
>>>>>>> OVS_UNUSED)
>>>>>>> +{
>>>>>>> +}
>>>>>>> +
>>>>>>> +static void
>>>>>>> +commit_fdb_local_learn_action(struct ovnact_commit_fdb_local
>>>>>>> *fdb_local,
>>>>>>> + struct ofpbuf *ofpacts, uint32_t cookie)
>>>>>>> +{
>>>>>>> + struct ofpact_learn *ol = ofpact_put_LEARN(ofpacts);
>>>>>>> + struct match match = MATCH_CATCHALL_INITIALIZER;
>>>>>>> + struct ofpact_learn_spec *ol_spec;
>>>>>>> + unsigned int imm_bytes;
>>>>>>> + uint8_t *src_imm;
>>>>>>> +
>>>>>>> + ol->flags = NX_LEARN_F_DELETE_LEARNED;
>>>>>>> + ol->idle_timeout = fdb_local->timeout;
>>>>>>> + ol->hard_timeout = OFP_FLOW_PERMANENT;
>>>>>>> + ol->priority = OFP_DEFAULT_PRIORITY;
>>>>>>> + ol->table_id = OFTABLE_GET_FDB;
>>>>>>> + ol->cookie = htonll(cookie);
>>>>>>> +
>>>>>>> + /* Match on metadata of the packet that created the new table. */
>>>>>>> + ol_spec = ofpbuf_put_zeros(ofpacts, sizeof *ol_spec);
>>>>>>> + ol_spec->dst.field = mf_from_id(MFF_METADATA);
>>>>>>> + ol_spec->dst.ofs = 0;
>>>>>>> + ol_spec->dst.n_bits = ol_spec->dst.field->n_bits;
>>>>>>> + ol_spec->n_bits = ol_spec->dst.n_bits;
>>>>>>> + ol_spec->dst_type = NX_LEARN_DST_MATCH;
>>>>>>> + ol_spec->src_type = NX_LEARN_SRC_FIELD;
>>>>>>> + ol_spec->src.field = mf_from_id(MFF_METADATA);
>>>>>>> +
>>>>>>> + /* Match on metadata of the packet. */
>>>>>>> + ol_spec = ofpbuf_put_zeros(ofpacts, sizeof *ol_spec);
>>>>>>> + ol_spec->dst.field = mf_from_id(MFF_ETH_DST);
>>>>>>> + ol_spec->dst.ofs = 0;
>>>>>>> + ol_spec->dst.n_bits = ol_spec->dst.field->n_bits;
>>>>>>> + ol_spec->n_bits = ol_spec->dst.n_bits;
>>>>>>> + ol_spec->dst_type = NX_LEARN_DST_MATCH;
>>>>>>> + ol_spec->src_type = NX_LEARN_SRC_FIELD;
>>>>>>> + ol_spec->src.field = mf_from_id(MFF_ETH_SRC);
>>>>>>> +
>>>>>>> +
>>>>>>> + /* Load MFF_LOG_OUTPORT from MFF_IN_PORT. */
>>>>>>> + ol_spec = ofpbuf_put_zeros(ofpacts, sizeof *ol_spec);
>>>>>>> + ol_spec->dst.field = mf_from_id(MFF_LOG_OUTPORT);
>>>>>>> + ol_spec->dst.ofs = 0;
>>>>>>> + ol_spec->dst.n_bits = ol_spec->dst.field->n_bits;
>>>>>>> + ol_spec->n_bits = ol_spec->dst.n_bits;
>>>>>>> + ol_spec->dst_type = NX_LEARN_DST_LOAD;
>>>>>>> + ol_spec->src_type = NX_LEARN_SRC_FIELD;
>>>>>>> + ol_spec->src.field = mf_from_id(MFF_LOG_INPORT);
>>>>>>> +
>>>>>>> + ofpact_finish_LEARN(ofpacts, &ol);
>>>>>>> +}
>>>>>>
>>>>>> A difference from today's SB.FDB centralized approach is that when
>>>>>> ovn-controller restarts these flows will be cleared, I think.
>>>>>>
>>>>>> Are we OK with that? I think so but if not what are the options to
>>>>>> avoid clearing the local fdb cache on restart?
>>>>>>
>>>>>
>>>>> OVS has to relearn the FDB flows whenever tables are cleared.
>>>>> During this time, packets gets flooded. I need to think about possible
>>>>> options if we want to retain FDB table.
>>>>>
>>>>> Can we take this up as an enhancement in a separate patch
>>>>> once we identify a solution for this ?
>>>>>
>>>>
>>>> Sounds good to me.
>>>>
>>>>>> Another difference with today's approach is that this avoids a
>>>>>> controller action, that's great!
>>>>>>
>>>>>>> +
>>>>>>> +static void
>>>>>>> +encode_COMMIT_FDB_LOCAL(const struct ovnact_commit_fdb_local
>>>>>>> *fdb_local,
>>>>>>> + const struct ovnact_encode_params *ep,
>>>>>>> + struct ofpbuf *ofpacts)
>>>>>>> +{
>>>>>>> + commit_fdb_local_learn_action(fdb_local, ofpacts,
>>>>>>> ep->lflow_uuid.parts[0]);
>>>>>>> +}
>>>>>>> +
>>>>>>> static void
>>>>>>> encode_MAC_CACHE_USE(const struct ovnact_null *null OVS_UNUSED,
>>>>>>> const struct ovnact_encode_params *ep,
>>>>>>> @@ -5451,6 +5543,8 @@ parse_action(struct action_context *ctx)
>>>>>>> parse_sample(ctx);
>>>>>>> } else if (lexer_match_id(ctx->lexer, "mac_cache_use")) {
>>>>>>> ovnact_put_MAC_CACHE_USE(ctx->ovnacts);
>>>>>>> + } else if (lexer_match_id(ctx->lexer, "commit_fdb_local")) {
>>>>>>> + parse_commit_fdb_local(ctx,
>>>>>>> ovnact_put_COMMIT_FDB_LOCAL(ctx->ovnacts));
>>>>>>> } else {
>>>>>>> lexer_syntax_error(ctx->lexer, "expecting action");
>>>>>>> }
>>>>>>> diff --git a/northd/northd.c b/northd/northd.c
>>>>>>> index d1465ddf7..de18694a0 100644
>>>>>>> --- a/northd/northd.c
>>>>>>> +++ b/northd/northd.c
>>>>>>> @@ -1834,6 +1834,12 @@ localnet_can_learn_mac(const struct
>>>>>>> nbrec_logical_switch_port *nbsp)
>>>>>>> return smap_get_bool( ->options, "localnet_learn_fdb", false);
>>>>>>> }
>>>>>>>
>>>>>>> +static bool
>>>>>>> +ls_is_fdb_local(const struct nbrec_logical_switch *nbs)
>>>>>>> +{
>>>>>>> + return smap_get_bool(&nbs->other_config, "fdb_local", false);
>>>>>>> +}
>>>>>>> +
>>>>>>
>>>>>> Personally, I'd prefer if we don't add another config knob and we just
>>>>>> make this the only way FDB works. We could also document that the FDB
>>>>>> SB table should be deprecated.
>>>>>>
>>>>>
>>>>> If we want to make local FDB as default then, I think we need to handle
>>>>> overlay
>>>>> use case as well. Probably, we might have to add a new stage in logical
>>>>> switch
>>>>> egress pipeline to learn FDB entries for packets coming over tunnel (or)
>>>>> something similar.
>>>>>
>>>>> Can we take this up in a separate patch ?
>>>>>
>>>>>
>>>>
>>>> Given that we have a fix already for the original problem you were
>>>> trying to address I would prefer that we avoid adding new config knobs
>>>> and handle both the localnet and overlay cases at the same time.
>>>>
>>>
>>> Sure.
>>>
>>>> OTOH, why is there a difference?
>>>>
>>>> The learned flow loads MFF_LOG_INPORT (from the packet that triggers the
>>>> learn() action) into MFF_LOG_INPORT:
>>>>
>>>> With Geneve and STT overlay MFF_LOG_INPORT is still correctly set. With
>>>> VXLAN that's not the case but that's already a documented limitation,
>>>> LOG_INPORT is not available after VXLAN tunneling therefore features
>>>> that need it (like egress ACLs matching against ingress port
>>>> identifiers) are not supported:
>>>>
>>>> https://urldefense.proofpoint.com/v2/url?u=https-3A__github.com_ovn-2Dorg_ovn_blob_main_ovn-2Darchitecture.7.xml-23L2842&d=DwIDaQ&c=s883GpUCOChKOHiocYtGcg&r=2PQjSDR7A28z1kXE1ptSm6X36oL_nCq1XxeEt7FkLmA&m=LO8-TXDG00EgnhB_oBbKGhuFWEv1FgwWbp-oMgCaMjsrN1ow1XIgiNdpeQ2FpNyQ&s=AyiJd4NaZd3P4HGvFbMO9No0cO8bux35PJNhtr8ZGe4&e=
>>>>
>>>>
>>>
>>> In case of overlay, ingress and egress pipelines are processed on different
>>> nodes
>>> assuming source and dest VMs are on different nodes. I think this makes
>>> local FDB
>>> to learn only local VMs MACs and it never learn remote VMs MACs.
>>>
>>> Example:
>>> 2 VMs with unknown addr set are connected to same logical switch where vm1
>>> is on hv1
>>> and vm2 is on hv2. When vm1 is sending ICMP req packet to vm2, ingress
>>> pipeline happens
>>> on hv1 and it learns vm1-mac and this packet gets flooded since vm2-mac is
>>> not yet learned.
>>> Packet reaches hv2 over tunnel and egress pipeline is exercised on hv2.
>>> Now, when vm2 is
>>> responding back, ingress pipeline happens on hv2 and it learns vm2-mac and
>>> this packet also
>>> gets flooded since vm1-mac is not learnt on hv2 when ICMP req packet is
>>> received.
>>>
>>>
>>
>> Ah, I see your point, you're right. So it does look like we'd need a
>> dedicated fdb learning stage in the egress pipeline. But that is
>> probably not that terrible.
>>
>>>> One more thing we need to take care of in order to be able to make local
>>>> FDB the default is "FDB refresh":
>>>>
>>>> 551527a5e68e ("controller: Update FDB timestamp")
>>>>
>>>> https://urldefense.proofpoint.com/v2/url?u=https-3A__github.com_ovn-2Dorg_ovn_commit_551527a5e68e7233ad80d212d549df98f13e37bc&d=DwIDaQ&c=s883GpUCOChKOHiocYtGcg&r=2PQjSDR7A28z1kXE1ptSm6X36oL_nCq1XxeEt7FkLmA&m=LO8-TXDG00EgnhB_oBbKGhuFWEv1FgwWbp-oMgCaMjsrN1ow1XIgiNdpeQ2FpNyQ&s=dNv1oya7S00oKbz1E9i5IyhqDOsEajuutcKSV6bIAfs&e=
>>>>
>>>>
>>>
>>> Sure.
>>>
>
> Hi Naveen,
>
> Just touching base on this again, I was curious if you're still planning
> to work on making the FDB table local; I still think it's beneficial in
> general.
>
> Thanks,
> Dumitru
Hi Dumitru,
Yes, I will work on this. I will try to send out the patch by Jan 1st week.
Thanks,
Naveen
_______________________________________________
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev