For overlay subnets, all cross-host traffic is tunneled. For VLAN subnets, we
need to selectively tunnel traffic sent to or coming from the NF ports.
Consider a from-lport ACL applied to port p1 on host1. The NF ports nfp1 and
nfp2 are on host2. A new LSP option, “nf-linked-port”, links the two NF ports:
it is set to nfp2 on nfp1 and to nfp1 on nfp2. The ingress pipeline on host1
sets the outport to nfp1 and the packet is then processed by table
REMOTE_OUTPUT.

On host1
--------
REMOTE_OUTPUT (table 43):
It tunnels traffic destined to all non-local overlay ports to their associated
hosts. The same rule is now also added for traffic to non-local NF ports. Thus
the packets from p1 get tunneled to host2.

Upon reaching host2
-------------------
PHY_TO_LOG (table 0):
Existing priority 100 rule: for each geneve tunnel interface on the chassis,
copy info from the header to the inport, outport and metadata registers. The
same rule now also stores the tunnel interface id in a register
(reg5[16..31]).

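CHECK_LOOPBACK (table 46):
This table has a rule that clears all the registers. For datapaths with NF
ports, a new priority 1 rule clears all registers except reg5[16..31], so the
tunnel interface id survives into the egress pipeline.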
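
The reg5 handling in PHY_TO_LOG and CHECK_LOOPBACK can be modelled in plain C.
This is only an illustrative sketch, not OVN code: the helper and macro names
are hypothetical, and only the bit range reg5[16..31] is taken from the
description above.

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative model only; these names are hypothetical, not OVN symbols. */
#define TUN_OFPORT_SHIFT 16
#define TUN_OFPORT_MASK  UINT32_C(0xffff0000)   /* reg5[16..31] */

/* PHY_TO_LOG (table 0): record the tunnel ofport in reg5[16..31],
 * leaving reg5[0..15] untouched. */
static inline uint32_t
reg5_store_tun_ofport(uint32_t reg5, uint16_t ofport)
{
    return (reg5 & ~TUN_OFPORT_MASK)
           | ((uint32_t) ofport << TUN_OFPORT_SHIFT);
}

/* CHECK_LOOPBACK (table 46) on an NF datapath: clear reg5[0..15] only,
 * so the tunnel ofport survives into the egress pipeline. */
static inline uint32_t
reg5_clear_for_nf(uint32_t reg5)
{
    return reg5 & TUN_OFPORT_MASK;
}

/* Egress pipeline: read back the value that is committed to
 * ct_label.tun_if_id. */
static inline uint16_t
reg5_get_tun_ofport(uint32_t reg5)
{
    return (uint16_t) (reg5 >> TUN_OFPORT_SHIFT);
}

/* Round trip: store, partially clear, read back. */
static inline void
reg5_example(void)
{
    uint32_t reg5 = reg5_store_tun_ofport(UINT32_C(0x0000abcd), 7);

    assert(reg5 == UINT32_C(0x0007abcd));
    reg5 = reg5_clear_for_nf(reg5);
    assert(reg5 == UINT32_C(0x00070000));
    assert(reg5_get_tun_ofport(reg5) == 7);
}
```

Calling reg5_example() from a test program exercises the full round trip. The
point of the partial clear is that the upper half of the register carries
state across the table boundary while the lower half starts clean.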

Logical egress pipeline:
ls_out_stateful priority 120: If the outport is an NF port, copy reg5[16..31]
(set in table 0) to ct_label.tun_if_id.

LOCAL_OUTPUT (table 45):
When the packet comes out of the other NF port (nfp2), the following two rules
send it back to the host that it originally came from.
Priority 110: For each NF port local to this host, the following rule
processes the packet through the CT zone of the linked port and recirculates
it to the same table:
  match: inport==nfp2 && RECIRC_BIT==0
  action: RECIRC_BIT = 1, ct(zone=nfp1’s zone, table=LOCAL_OUTPUT)

Priority 109: For each local {tunnel_id, nf port}, send the recirculated packet
using the tun_if_id stored in the ct_label of that zone:
  match: inport==nfp2 && RECIRC_BIT==1 && ct_label.tun_if_id==<tun-id>
  action: tunnel packet using tun-id
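
The ct_label.tun_if_id match maps onto the 128-bit ct_label as follows. This
is a minimal sketch, assuming the subfield occupies ct_label bits 80..95 as
defined in this patch; struct label128 and the helper names are hypothetical
stand-ins, not the OVS types.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for a 128-bit ct_label split into two halves. */
struct label128 {
    uint64_t lo;    /* ct_label bits 0..63 */
    uint64_t hi;    /* ct_label bits 64..127 */
};

#define TUN_IF_1ST_BIT 80
#define TUN_IF_END_BIT 95

/* Value to match: the ofport placed at bits 80..95, i.e. bits 16..31 of
 * the high 64-bit half. */
static inline struct label128
tun_if_id_value(uint16_t ofport)
{
    struct label128 v = { 0, (uint64_t) ofport << (TUN_IF_1ST_BIT - 64) };
    return v;
}

/* Mask covering exactly bits 80..95. */
static inline struct label128
tun_if_id_mask(void)
{
    unsigned int width = TUN_IF_END_BIT - TUN_IF_1ST_BIT + 1;
    struct label128 m = {
        0, ((UINT64_C(1) << width) - 1) << (TUN_IF_1ST_BIT - 64)
    };
    return m;
}

/* Masked comparison, as a flow match on ct_label would perform it. */
static inline int
label_matches(struct label128 label, struct label128 value,
              struct label128 mask)
{
    return (label.lo & mask.lo) == value.lo
           && (label.hi & mask.hi) == value.hi;
}

static inline void
label_example(void)
{
    /* Other ct_label bits are set but ignored by the mask. */
    struct label128 label = { 0x1234, (UINT64_C(5) << 16) | 0xff };

    assert(tun_if_id_mask().hi == UINT64_C(0x00000000ffff0000));
    assert(label_matches(label, tun_if_id_value(5), tun_if_id_mask()));
    assert(!label_matches(label, tun_if_id_value(6), tun_if_id_mask()));
}
```

The computed mask equals the constant 0x00000000ffff0000 used for the high
half of the label in send_traffic_by_tunnel().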

Case where NF responds back on nfp1, instead of forwarding to nfp2
------------------------------------------------------------------
For example, a SYN packet from p1 is redirected to nfp1. The NF, which is a
firewall VM, drops the SYN and sends an RST back on port nfp1. In this case,
the lookup in the linked port’s (nfp2) CT zone finds no matching entry. A rule
matching on ct.inv identifies this scenario, and the tunnel id that was
committed in nfp1’s CT zone is used to send the packet back. To achieve this,
the following two rules are installed:

in_network_function:
The priority 100 rule that allows packets incoming from NF type ports is
enhanced with an additional action that stores the tun_if_id from ct_label
into reg5[16..31].

LOCAL_OUTPUT (table 45):
Priority 110 rule: for recirculated packets, if the ct lookup in the linked
port’s zone is invalid, use the tunnel id from MFF_LOG_TUN_OFPORT
(reg5[16..31]) to tunnel the packet back. The register is needed because the
priority 110 rule above has already overwritten the CT zone info.
  match: inport==nfp1 && RECIRC_BIT==1 && ct.inv && reg5[16..31]==<tun-id>
  action: tunnel packet using tun-id
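
Taken together, the LOCAL_OUTPUT rules for a packet leaving an NF port behave
roughly like the decision function below. This is a simplified model under
stated assumptions, not the installed flows: the type and function names are
hypothetical, and "tunnel id matches some local tunnel" is approximated by a
non-zero check.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of the per-packet state the rules match on. */
struct pkt_state {
    bool from_nf_port;            /* inport is a local NF port */
    bool recirc;                  /* RECIRC_BIT */
    bool ct_inv;                  /* ct.inv after lookup in linked zone */
    uint16_t reg_tun_ofport;      /* reg5[16..31] */
    uint16_t ct_label_tun_if_id;  /* ct_label.tun_if_id from linked zone */
};

enum local_output_action {
    ACT_NO_MATCH,         /* fall through to the other table 45 rules */
    ACT_CT_RECIRC,        /* prio 110: set RECIRC_BIT, ct(linked zone) */
    ACT_TUNNEL_FROM_REG,  /* prio 110: ct.inv, tunnel id from reg5 */
    ACT_TUNNEL_FROM_CT,   /* prio 109: tunnel id from ct_label */
};

/* Returns the rule that would fire; *tun_ofport is set when the packet is
 * tunneled back. Non-zero ids stand in for "matches a local tunnel". */
static inline enum local_output_action
nf_local_output(const struct pkt_state *p, uint16_t *tun_ofport)
{
    *tun_ofport = 0;
    if (!p->from_nf_port) {
        return ACT_NO_MATCH;
    }
    if (!p->recirc) {
        return ACT_CT_RECIRC;
    }
    if (p->ct_inv && p->reg_tun_ofport) {
        *tun_ofport = p->reg_tun_ofport;
        return ACT_TUNNEL_FROM_REG;
    }
    if (p->ct_label_tun_if_id) {
        *tun_ofport = p->ct_label_tun_if_id;
        return ACT_TUNNEL_FROM_CT;
    }
    return ACT_NO_MATCH;
}

static inline void
nf_local_output_example(void)
{
    uint16_t port;
    /* NF replied on the port it received the packet on: ct.inv is set. */
    struct pkt_state rst = { true, true, true, 4, 0 };

    assert(nf_local_output(&rst, &port) == ACT_TUNNEL_FROM_REG);
    assert(port == 4);
}
```

The two priority 110 rules never compete because they differ on RECIRC_BIT;
the priority 109 rule only applies when the ct.inv rule did not match.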

Signed-off-by: Sragdhara Datta Chaudhuri <[email protected]>
Acked-by: Naveen Yerramneni <[email protected]>
---
 NEWS                         |   5 +
 TODO.rst                     |   5 +
 controller/physical.c        | 313 ++++++++++++++++++++++++-
 include/ovn/logical-fields.h |   9 +
 lib/logical-fields.c         |  10 +
 northd/northd.c              |  82 +++++--
 ovn-nb.xml                   |  16 ++
 tests/ovn-controller.at      |   4 +-
 tests/ovn-nbctl.at           |   8 +-
 tests/ovn-northd.at          |  32 ++-
 tests/ovn.at                 | 442 +++++++++++++++++++++++++++--------
 tests/system-ovn.at          | 164 +++++++++++++
 12 files changed, 962 insertions(+), 128 deletions(-)

diff --git a/NEWS b/NEWS
index 66eb9e0b1..619d8f1e2 100644
--- a/NEWS
+++ b/NEWS
@@ -22,6 +22,11 @@ Post v25.09.0
      NOTE:
      * Network functions must not modify packet headers.
      * The feature is not supported in conjunction with Load Balancer.
+     * The feature is supported for both VLAN and overlay networks.
+       When network function is used in a VLAN network, geneve tunneling is used
+       for cross host traffic (between the chassis hosting network function and
+       the chassis hosting the port where the ACL is being enforced). Proper
+       MTU needs to be configured to accommodate this encapsulation.
    - Added disable_garp_rarp option to logical_router table in order to disable
      GARP/RARP announcements by all the peer ports of this logical router.
 
diff --git a/TODO.rst b/TODO.rst
index cda5f0d99..522389919 100644
--- a/TODO.rst
+++ b/TODO.rst
@@ -171,6 +171,11 @@ OVN To-do List
     allow for the eventual removal of the ovn\_datapath structure from the
     codebase.
 
+* Network function insertion
+
+  * Geneve tunnel is used for supporting this feature for VLAN network.
+    Extend the support over VxLAN tunnel as well.
+
 * CI
 
   * ovn-kubernetes: Only a subset of the ovn-kubernetes features is currently
diff --git a/controller/physical.c b/controller/physical.c
index 9ca535a6c..daae1e7c5 100644
--- a/controller/physical.c
+++ b/controller/physical.c
@@ -175,6 +175,8 @@ put_decapsulation(enum mf_field_id mff_ovn_geneve,
         put_move(MFF_TUN_ID, 0,  MFF_LOG_DATAPATH, 0, 24, ofpacts);
         put_move(mff_ovn_geneve, 16, MFF_LOG_INPORT, 0, 15, ofpacts);
         put_move(mff_ovn_geneve, 0, MFF_LOG_OUTPORT, 0, 16, ofpacts);
+        put_load(ofp_to_u16(tun->ofport), MFF_LOG_TUN_OFPORT,
+                 16, 16, ofpacts);
     } else if (tun->type == VXLAN) {
         /* Add flows for non-VTEP tunnels. Split VNI into two 12-bit
          * sections and use them for datapath and outport IDs. */
@@ -387,6 +389,15 @@ match_outport_dp_and_port_keys(struct match *match,
     match_set_reg(match, MFF_LOG_OUTPORT - MFF_REG0, port_key);
 }
 
+static void
+match_inport_dp_and_port_keys(struct match *match,
+                              uint32_t dp_key, uint32_t port_key)
+{
+    match_init_catchall(match);
+    match_set_metadata(match, htonll(dp_key));
+    match_set_reg(match, MFF_LOG_INPORT - MFF_REG0, port_key);
+}
+
 static struct sbrec_encap *
 find_additional_encap_for_chassis(const struct sbrec_port_binding *pb,
                                   const struct sbrec_chassis *chassis_rec)
@@ -452,7 +463,8 @@ put_remote_port_redirect_overlay(const struct sbrec_port_binding *binding,
                                  uint32_t port_key,
                                  struct match *match,
                                  struct ofpbuf *ofpacts_p,
-                                 struct ovn_desired_flow_table *flow_table)
+                                 struct ovn_desired_flow_table *flow_table,
+                                 bool allow_hairpin)
 {
     /* Setup encapsulation */
     for (size_t i = 0; i < ctx->n_encap_ips; i++) {
@@ -471,6 +483,14 @@ put_remote_port_redirect_overlay(const struct sbrec_port_binding *binding,
                          ofpacts_clone);
             }
 
+            /* Clear MFF_IN_PORT if the packet may need to go back out of the
+             * tunnel port it arrived on. */
+            if (allow_hairpin) {
+                put_stack(MFF_IN_PORT, ofpact_put_STACK_PUSH(ofpacts_clone));
+                put_load(ofp_to_u16(OFPP_NONE), MFF_IN_PORT, 0, 16,
+                         ofpacts_clone);
+            }
+
             const struct chassis_tunnel *tun;
             VECTOR_FOR_EACH (&tuns, tun) {
                 put_encapsulation(ctx->mff_ovn_geneve, tun, binding->datapath,
@@ -478,6 +498,11 @@ put_remote_port_redirect_overlay(const struct sbrec_port_binding *binding,
                 ofpact_put_OUTPUT(ofpacts_clone)->port = tun->ofport;
             }
             put_resubmit(OFTABLE_REMOTE_VTEP_OUTPUT, ofpacts_clone);
+
+            if (allow_hairpin) {
+                put_stack(MFF_IN_PORT, ofpact_put_STACK_POP(ofpacts_clone));
+            }
+
             ofctrl_add_flow(flow_table, OFTABLE_REMOTE_OUTPUT, 100,
                             binding->header_.uuid.parts[0], match,
                             ofpacts_clone, &binding->header_.uuid);
@@ -487,6 +512,229 @@ put_remote_port_redirect_overlay(const struct sbrec_port_binding *binding,
     }
 }
 
+static const struct sbrec_port_binding *
+get_binding_network_function_linked_port(
+                    struct ovsdb_idl_index *sbrec_port_binding_by_name,
+                    const struct sbrec_port_binding *binding)
+{
+    const char *nf_linked_name = smap_get(&binding->options,
+                                          "nf-linked-port");
+    if (!nf_linked_name) {
+        return NULL;
+    }
+    VLOG_DBG("get NF linked port_binding %s:%s",
+             binding->logical_port, nf_linked_name);
+    const struct sbrec_port_binding *nf_linked_port = lport_lookup_by_name(
+        sbrec_port_binding_by_name, nf_linked_name);
+    static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(1, 1);
+    if (!nf_linked_port) {
+        VLOG_INFO_RL(&rl, "Binding not found for nf-linked-port"
+                  " %s", nf_linked_name);
+        return NULL;
+    }
+    if (strcmp(nf_linked_port->type, binding->type)) {
+        VLOG_ERR_RL(&rl, "Binding type mismatch between %s and "
+                  "nf-linked-port %s",
+                  binding->logical_port,  nf_linked_name);
+        return NULL;
+    }
+    const char *nf_linked_linked_name = smap_get(
+        &nf_linked_port->options, "nf-linked-port");
+    if (!nf_linked_linked_name || strcmp(nf_linked_linked_name,
+                                         binding->logical_port)) {
+        VLOG_INFO_RL(&rl, "LSP name %s does not match linked_linked_name",
+                  binding->logical_port);
+        return NULL;
+    }
+
+    return nf_linked_port;
+}
+
+static void
+send_traffic_by_tunnel(const struct sbrec_port_binding *binding,
+                       struct match *match,
+                       struct ofpbuf *ofpacts_p,
+                       uint32_t dp_key,
+                       uint32_t port_key,
+                       struct chassis_tunnel *tun,
+                       enum mf_field_id mff_ovn_geneve,
+                       struct ovn_desired_flow_table *flow_table)
+{
+    match_init_catchall(match);
+    ofpbuf_clear(ofpacts_p);
+
+    match_inport_dp_and_port_keys(match, dp_key, port_key);
+    match_set_reg_masked(match, MFF_LOG_FLAGS - MFF_REG0, MLF_RECIRC,
+                         MLF_RECIRC);
+    ovs_u128 of_tun_ct_label_id_val = {
+        .u64.hi = ((uint32_t) ofp_to_u16(tun->ofport)) << 16,
+    };
+    ovs_u128 of_tun_ct_label_id_mask = {
+        .u64.hi = 0x00000000ffff0000,
+    };
+
+    match_set_ct_label_masked(match, of_tun_ct_label_id_val,
+                              of_tun_ct_label_id_mask);
+
+    put_load(binding->datapath->tunnel_key, MFF_TUN_ID, 0, 24, ofpacts_p);
+    put_move(MFF_LOG_OUTPORT, 0, mff_ovn_geneve, 0, 32, ofpacts_p);
+    put_load(port_key, mff_ovn_geneve, 16, 15, ofpacts_p);
+
+    ofpact_put_OUTPUT(ofpacts_p)->port = tun->ofport;
+    ofctrl_add_flow(flow_table, OFTABLE_LOCAL_OUTPUT, 109,
+                    binding->header_.uuid.parts[0], match,
+                    ofpacts_p, &binding->header_.uuid);
+}
+
+static void
+put_redirect_overlay_to_source(const struct sbrec_port_binding *binding,
+                               int linked_ct,
+                               const struct hmap *chassis_tunnels,
+                               enum mf_field_id mff_ovn_geneve,
+                               struct match *match,
+                               struct ofpbuf *ofpacts_p,
+                               struct ovn_desired_flow_table *flow_table)
+{
+    uint32_t dp_key = binding->datapath->tunnel_key;
+    uint32_t port_key = binding->tunnel_key;
+
+    /* Say, a network function has ports nf1 and nf2. The source port p1 is on
+     * a different host. The packet redirected from p1 was tunneled to the NF
+     * host. In PHY_TO_LOG table the tunnel interface id is stored in
+     * MFF_LOG_TUN_OFPORT. The egress pipeline then commits it into ct_label
+     * tun_if_id in nf1's zone (out_stateful priority 120 rule). When the same
+     * packet comes out from nf2, two rules process it:
+     * first rule sets recirc bit to 1 and processes the packet through nf1's
+     * ct zone and resubmits to same table. When the recirculated packet comes
+     * back, the second rule (which checks recirc bit == 1) uses the tun_if_id
+     * from ct_label to send the packet back to p1's host.
+     */
+
+    /* Table 45 (LOCAL_OUTPUT), priority 110
+     * =====================================
+     *
+     * Each flow matches a logical inport to a nf port and checks if
+     * recirc bit is 0 (i.e. packet first time being processed by this table).
+     * The action processes the packet through ct zone of the linked nf port
+     * and resubmits to the same table after setting recirc bit to 1.
+     * match: inport == svc-port[i] && MLF_RECIRC_BIT = 0
+     * action: MLF_RECIRC_BIT = 1, ct(zone=linked-zone[i], table=LOCAL)
+     */
+    match_init_catchall(match);
+    ofpbuf_clear(ofpacts_p);
+    match_inport_dp_and_port_keys(match, dp_key, port_key);
+    match_set_dl_type(match, htons(ETH_TYPE_IP));
+    match_set_reg_masked(match, MFF_LOG_FLAGS - MFF_REG0, 0, MLF_RECIRC);
+
+    put_load(1, MFF_LOG_FLAGS, MLF_RECIRC_BIT, 1, ofpacts_p);
+    put_load(linked_ct, MFF_LOG_CT_ZONE, 0, 16, ofpacts_p);
+
+    struct ofpact_conntrack *ct = ofpact_put_CT(ofpacts_p);
+    ct->recirc_table = OFTABLE_LOCAL_OUTPUT;
+    ct->zone_src.field = mf_from_id(MFF_LOG_CT_ZONE);
+    ct->zone_src.ofs = 0;
+    ct->zone_src.n_bits = 16;
+    ct->flags = 0;
+    ct->alg = 0;
+    ofpact_finish(ofpacts_p, &ct->ofpact);
+
+    ofctrl_add_flow(flow_table, OFTABLE_LOCAL_OUTPUT, 110,
+                    binding->header_.uuid.parts[0], match,
+                    ofpacts_p, &binding->header_.uuid);
+
+    /* Table 45 (LOCAL_OUTPUT), priority 110
+     * In case NF is sending back a response on the port it received the
+     * packet on, instead of forwarding out of the other port (e.g. NF sending
+     * RST to the SYN received), the ct lookup in linked port's zone would
+     * fail. Based on ct.inv check the packet is then tunneled back using
+     * the tunnel id from this port's zone itself. The above rule has
+     * overwritten the zone info by now, so we recover it from the register
+     * that was populated by in_network_function stage with the tunnel id.
+     * match: inport == svc-port[i] && MLF_RECIRC_BIT = 1
+     *        && ct.inv && MFF_LOG_TUN_OFPORT == <tun-id>
+     * action: tunnel back using above tun-id
+     */
+    struct chassis_tunnel *tun;
+    HMAP_FOR_EACH (tun, hmap_node, chassis_tunnels) {
+        match_init_catchall(match);
+        ofpbuf_clear(ofpacts_p);
+        match_inport_dp_and_port_keys(match, dp_key, port_key);
+        match_set_reg_masked(match, MFF_LOG_FLAGS - MFF_REG0, MLF_RECIRC,
+                             MLF_RECIRC);
+        match_set_ct_state_masked(match, OVS_CS_F_INVALID, OVS_CS_F_INVALID);
+        match_set_reg_masked(match, MFF_LOG_TUN_OFPORT - MFF_REG0,
+                             ((uint32_t) ofp_to_u16(tun->ofport)) << 16,
+                             ((uint32_t) 0xffff) << 16);
+        put_load(binding->datapath->tunnel_key, MFF_TUN_ID, 0, 24, ofpacts_p);
+        put_move(MFF_LOG_OUTPORT, 0, mff_ovn_geneve, 0, 32, ofpacts_p);
+        put_load(port_key, mff_ovn_geneve, 16, 15, ofpacts_p);
+
+        ofpact_put_OUTPUT(ofpacts_p)->port = tun->ofport;
+        ofctrl_add_flow(flow_table, OFTABLE_LOCAL_OUTPUT, 110,
+                        binding->header_.uuid.parts[0], match,
+                        ofpacts_p, &binding->header_.uuid);
+    }
+
+    /* Table 45 (LOCAL_OUTPUT), priority 109
+     * =====================================
+     *
+     * A flow is installed for each {remote tunnel_id, nf port} combination. It
+     * matches the inport with the nf port and the ct_label.tun_if_id with the
+     * tunnel_id. Also checks if the recirc bit is 1 (i.e. packet being
+     * processed by this table second time). The action is to send the packet
+     * out using the tunnel interface.
+     * match: inport == svc-port[i] && MLF_RECIRC_BIT = 1
+     *        && ct_label.tun_if_id == <tun-id>
+     * action: tunnel back using tun-id
+     */
+    HMAP_FOR_EACH (tun, hmap_node, chassis_tunnels) {
+        send_traffic_by_tunnel(binding, match, ofpacts_p, dp_key, port_key,
+                               tun, mff_ovn_geneve, flow_table);
+    }
+    ofpbuf_clear(ofpacts_p);
+}
+
+static void
+put_redirect_overlay_to_source_from_nf_port(
+        const struct sbrec_port_binding *binding,
+        struct ovsdb_idl_index *sbrec_port_binding_by_name,
+        const struct hmap *chassis_tunnels,
+        const struct shash *ct_zones,
+        enum mf_field_id mff_ovn_geneve,
+        struct match *match,
+        struct ofpbuf *ofpacts_p,
+        struct ovn_desired_flow_table *flow_table)
+{
+    const struct sbrec_port_binding *linked_pb;
+    linked_pb = get_binding_network_function_linked_port(
+        sbrec_port_binding_by_name, binding);
+    static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(1, 1);
+    if (!linked_pb) {
+        VLOG_INFO_RL(&rl, "Linked port not found for %s",
+                     binding->logical_port);
+        return;
+    }
+    struct zone_ids zone = get_zone_ids(binding, ct_zones);
+    if (!zone.ct) {
+        VLOG_INFO_RL(&rl, "Port zone not found for %s", binding->logical_port);
+        return;
+    }
+    struct zone_ids linked_zone = get_zone_ids(linked_pb, ct_zones);
+    if (!linked_zone.ct) {
+        VLOG_INFO_RL(&rl, "Linked port zone not found for %s",
+                     binding->logical_port);
+        return;
+    }
+    VLOG_DBG_RL(&rl, "Both port zones found for NF port %s",
+                binding->logical_port);
+    put_redirect_overlay_to_source(binding, linked_zone.ct, chassis_tunnels,
+                                   mff_ovn_geneve, match, ofpacts_p,
+                                   flow_table);
+    put_redirect_overlay_to_source(linked_pb, zone.ct, chassis_tunnels,
+                                   mff_ovn_geneve,  match, ofpacts_p,
+                                   flow_table);
+}
+
 static void
 put_remote_port_redirect_overlay_ha_remote(
     const struct sbrec_port_binding *binding,
@@ -962,6 +1210,29 @@ add_default_drop_flow(const struct physical_ctx *p_ctx,
     ofpbuf_uninit(&ofpacts);
 }
 
+/* Clear logical registers for network function datapaths.
+ * Resets all logical registers to zero except MFF_LOG_TUN_OFPORT, which is
+ * partially cleared. Bits 16-31 store the geneve tunnel interface ID of
+ * received packets and are preserved for the egress pipeline.
+ * Bits 0-15 are cleared.
+ */
+static void
+clear_registers_for_nf_datapath(struct ofpbuf *ofpacts_p)
+{
+    /* Clear all logical registers except MFF_LOG_TUN_OFPORT */
+    for (int i = 0; i < MFF_N_LOG_REGS; i++) {
+        if ((MFF_REG0 + i) != MFF_LOG_TUN_OFPORT) {
+            /* Clear entire 32-bit register */
+            put_load(0, MFF_REG0 + i, 0, 32, ofpacts_p);
+        }
+    }
+
+    /* Partially clear MFF_LOG_TUN_OFPORT register:
+     * - Bits 16-31: Preserve geneve tunnel ID for egress pipeline
+     * - Bits 0-15: Clear to zero for clean state */
+    put_load(0, MFF_LOG_TUN_OFPORT, 0, 16, ofpacts_p);
+}
+
 static void
 put_local_common_flows(uint32_t dp_key,
                        const struct sbrec_port_binding *pb,
@@ -1013,6 +1284,24 @@ put_local_common_flows(uint32_t dp_key,
                     pb->header_.uuid.parts[0], &match, ofpacts_p,
                     &pb->header_.uuid);
 
+    /* Table 46, Priority 1.
+     * =======================
+     * For datapath with network function ports, add a flow to clear only the
+     * required logical registers.
+     * In the default case, priority 0 rule clears all the registers.
+     */
+    bool nf_port = smap_get_bool(&pb->options, "is-nf", false);
+    if (nf_port) {
+        match_init_catchall(&match);
+        ofpbuf_clear(ofpacts_p);
+        match_set_metadata(&match, htonll(dp_key));
+        clear_registers_for_nf_datapath(ofpacts_p);
+        put_resubmit(OFTABLE_LOG_EGRESS_PIPELINE, ofpacts_p);
+        ofctrl_add_flow(flow_table, OFTABLE_CHECK_LOOPBACK, 1,
+                        pb->datapath->header_.uuid.parts[0], &match,
+                        ofpacts_p, &pb->datapath->header_.uuid);
+    }
+
     /* Table 64, Priority 100.
      * =======================
      *
@@ -1907,10 +2196,11 @@ consider_port_binding(const struct physical_ctx *ctx,
 
     /* Determine how the port is accessed. */
     enum access_type access_type = PORT_LOCAL;
+    bool is_nf = smap_get_bool(&binding->options, "is-nf", false);
     if (!ofport) {
         /* Enforce tunneling while we clone packets to additional chassis b/c
          * otherwise upstream switch won't flood the packet to both chassis. */
-        if (localnet_port && !binding->additional_chassis) {
+        if (localnet_port && !binding->additional_chassis && !is_nf) {
             ofport = u16_to_ofp(simap_get(ctx->patch_ofports,
                                           localnet_port->logical_port));
             if (!ofport) {
@@ -2140,6 +2430,20 @@ consider_port_binding(const struct physical_ctx *ctx,
                             binding->header_.uuid.parts[0], &match,
                             ofpacts_p, &binding->header_.uuid);
         }
+
+        /* Packets egressing from network function ports need to be sent to the
+         * source. */
+        if (is_nf && localnet_port) {
+            put_redirect_overlay_to_source_from_nf_port(
+                                 binding,
+                                 ctx->sbrec_port_binding_by_name,
+                                 ctx->chassis_tunnels,
+                                 ctx->ct_zones,
+                                 ctx->mff_ovn_geneve,
+                                 &match,
+                                 ofpacts_p,
+                                 flow_table);
+        }
     } else if (access_type == PORT_LOCALNET && !ctx->always_tunnel) {
         /* Remote port connected by localnet port */
         /* Table 45, priority 100.
@@ -2199,7 +2503,8 @@ consider_port_binding(const struct physical_ctx *ctx,
             &match, ofpacts_p, ctx->chassis_tunnels, flow_table);
     } else {
         put_remote_port_redirect_overlay(
-            binding, type, ctx, port_key, &match, ofpacts_p, flow_table);
+            binding, type, ctx, port_key, &match, ofpacts_p, flow_table,
+            is_nf);
     }
 out:
     if (ha_ch_ordered) {
@@ -3194,7 +3499,7 @@ physical_run(struct physical_ctx *p_ctx,
      *
      * Handles packets received from a VXLAN tunnel which get resubmitted to
      * OFTABLE_LOG_INGRESS_PIPELINE due to lack of needed metadata in VXLAN,
-     * explicitly skip sending back out any tunnels and resubmit to table 40
+     * explicitly skip sending back out any tunnels and resubmit to table 43
      * for local delivery, except packets which have MLF_ALLOW_LOOPBACK bit
      * set.
      */
diff --git a/include/ovn/logical-fields.h b/include/ovn/logical-fields.h
index 76925eac7..d2ba45240 100644
--- a/include/ovn/logical-fields.h
+++ b/include/ovn/logical-fields.h
@@ -42,6 +42,7 @@ enum ovn_controller_event {
                                        * (16..31 of the 32 bits). */
 #define MFF_LOG_INPORT     MFF_REG14  /* Logical input port (32 bits). */
 #define MFF_LOG_OUTPORT    MFF_REG15  /* Logical output port (32 bits). */
+#define MFF_LOG_TUN_OFPORT MFF_REG5   /* 16..31 of the 32 bits */
 
 /* Logical registers.
  *
@@ -104,6 +105,7 @@ enum mff_log_flags_bits {
     MLF_UNSNAT_NEW_BIT = 20,
     MLF_UNSNAT_NOT_TRACKED_BIT = 21,
     MLF_IGMP_IGMP_SNOOP_INJECT_BIT = 22,
+    MLF_RECIRC_BIT = 23,
     MLF_NETWORK_ID_START_BIT = 28,
     MLF_NETWORK_ID_END_BIT = 31,
 };
@@ -173,6 +175,9 @@ enum mff_log_flags {
     /* Indicate that this is an IGMP packet reinjected by ovn-controller. */
     MLF_IGMP_IGMP_SNOOP = (1 << MLF_IGMP_IGMP_SNOOP_INJECT_BIT),
 
+    /* Indicate the packet has been processed by LOCAL table once before. */
+    MLF_RECIRC = (1 << MLF_RECIRC_BIT),
+
     /* Assign network ID to packet to choose correct network for snat when
      * lb_force_snat_ip=router_ip. */
     MLF_NETWORK_ID = (OVN_MAX_NETWORK_ID << MLF_NETWORK_ID_START_BIT),
@@ -240,15 +245,19 @@ const struct ovn_field *ovn_field_from_name(const char *name);
 #define OVN_CT_OBS_STAGE_END_BIT 5
 #define OVN_CT_ALLOW_ESTABLISHED_BIT 6
 #define OVN_CT_NETWORK_FUNCTION_GROUP_BIT 7
+#define OVN_CT_TUN_IF_BIT 8
 
 #define OVN_CT_BLOCKED 1
 #define OVN_CT_NATTED  2
 #define OVN_CT_LB_SKIP_SNAT 4
 #define OVN_CT_LB_FORCE_SNAT 8
 #define OVN_CT_NETWORK_FUNCTION_GROUP 128
+#define OVN_CT_TUN_IF 256
 
 #define OVN_CT_NETWORK_FUNCTION_GROUP_ID_1ST_BIT 17
 #define OVN_CT_NETWORK_FUNCTION_GROUP_ID_END_BIT 24
+#define OVN_CT_TUN_IF_1ST_BIT 80
+#define OVN_CT_TUN_IF_END_BIT 95
 
 #define OVN_CT_ECMP_ETH_1ST_BIT 32
 #define OVN_CT_ECMP_ETH_END_BIT 79
diff --git a/lib/logical-fields.c b/lib/logical-fields.c
index 809ae39af..e19e6a757 100644
--- a/lib/logical-fields.c
+++ b/lib/logical-fields.c
@@ -233,6 +233,16 @@ ovn_init_symtab(struct shash *symtab)
                                     OVN_CT_NETWORK_FUNCTION_GROUP_ID_END_BIT)
                                     "]",
                                     WR_CT_COMMIT);
+    expr_symtab_add_subfield_scoped(symtab, "ct_label.tun_if", NULL,
+                                    "ct_label["
+                                    OVN_CT_STR(OVN_CT_TUN_IF_BIT)
+                                    "]",
+                                    WR_CT_COMMIT);
+    expr_symtab_add_subfield_scoped(symtab, "ct_label.tun_if_id", NULL,
+                                    "ct_label["
+                                    OVN_CT_STR(OVN_CT_TUN_IF_1ST_BIT) ".."
+                                    OVN_CT_STR(OVN_CT_TUN_IF_END_BIT) "]",
+                                    WR_CT_COMMIT);
 
     expr_symtab_add_field(symtab, "ct_state", MFF_CT_STATE, NULL, false);
 
diff --git a/northd/northd.c b/northd/northd.c
index 672fffcab..1e177b8d2 100644
--- a/northd/northd.c
+++ b/northd/northd.c
@@ -240,6 +240,10 @@ static const char *reg_ct_state[] = {
 #undef CS_STATE
 };
 
+/* Register used for storing tunnel openflow interface id, in a Logical Switch.
+ * Must match the MFF_LOG_TUN_OFPORT in logical-fields.h */
+#define REG_TUN_OFPORT "reg5[16..31]"
+
 /* Register used for temporarily store ECMP eth.src to avoid masked ct_label
  * access. It doesn't really occupy registers because the content of the
  * register is saved to stack and then restored in the same flow.
@@ -283,7 +287,7 @@ static const char *reg_ct_state[] = {
  * | R4 |                 REG_LB_IPV4                  |   |                                   |
  * | R4 |    (>= IN_PRE_STATEFUL && <= IN_HAIRPIN)     | X |                                   |
  * +----+----------------------------------------------+ X |           REG_LB_IPV6             |
- * | R5 |                   UNUSED                     | R |      (>= IN_PRE_STATEFUL &&       |
+ * | R5 |           REG_TUN_OFPORT (16..31)            | R |      (>= IN_PRE_STATEFUL &&       |
  * +----+----------------------------------------------+ E |       <= IN_HAIRPIN)              |
  * | R6 |                   UNUSED                     | G |                                   |
  * +----+----------------------------------------------+ 1 |                                   |
@@ -17789,11 +17793,53 @@ network_function_get_active(const struct nbrec_network_function_group *nfg)
     return nfg->n_network_function ? nfg->network_function[0] : NULL;
 }
 
+/* For packets received on tunnel and egressing towards a network-function port
+ * commit the tunnel interface id in CT. This will be utilized when the packet
+ * comes out of the other network-function interface of the service VM. The
+ * packet then will be tunneled back to the source host. */
+static void
+build_lswitch_stateful_nf(struct ovn_port *op,
+                          struct ds *actions, struct ds *match,
+                          struct lflow_table *lflows,
+                          struct lflow_ref *lflow_ref)
+{
+    ds_clear(actions);
+    ds_clear(match);
+
+    ds_put_cstr(actions,
+                 "ct_commit { "
+                    "ct_mark.blocked = 0; "
+                    "ct_mark.allow_established = " REGBIT_ACL_PERSIST_ID "; "
+                    "ct_label.acl_id = " REG_ACL_ID "; "
+                    "ct_label.tun_if_id = " REG_TUN_OFPORT "; }; next;");
+    ds_put_format(match,
+                  "outport == %s && " REGBIT_ACL_LABEL" == 0", op->json_key);
+    ovn_lflow_add(lflows, op->od, S_SWITCH_OUT_STATEFUL, 120,
+                  ds_cstr(match), ds_cstr(actions), lflow_ref);
+
+    ds_clear(actions);
+    ds_clear(match);
+    ds_put_format(match,
+                  "outport == %s && " REGBIT_ACL_LABEL" == 1",
+                  op->json_key);
+    ds_put_cstr(actions,
+                 "ct_commit { "
+                    "ct_mark.blocked = 0; "
+                    "ct_mark.allow_established = " REGBIT_ACL_PERSIST_ID "; "
+                    "ct_label.acl_id = " REG_ACL_ID "; "
+                    "ct_mark.obs_stage = " REGBIT_ACL_OBS_STAGE "; "
+                    "ct_mark.obs_collector_id = " REG_OBS_COLLECTOR_ID_EST "; "
+                    "ct_label.obs_point_id = " REG_OBS_POINT_ID_EST "; "
+                    "ct_label.tun_if_id = " REG_TUN_OFPORT "; }; next;");
+    ovn_lflow_add(lflows, op->od, S_SWITCH_OUT_STATEFUL, 120,
+                  ds_cstr(match), ds_cstr(actions), lflow_ref);
+}
+
 static void
-consider_network_function(struct lflow_table *lflows,
-                          const struct ovn_datapath *od,
+consider_network_function(const struct ovn_datapath *od,
                           struct nbrec_network_function_group *nfg,
-                          struct lflow_ref *lflow_ref, bool ingress)
+                          bool ingress, struct lflow_table *lflows,
+                          struct lflow_ref *lflow_ref)
 {
     struct ds match = DS_EMPTY_INITIALIZER;
     struct ds action = DS_EMPTY_INITIALIZER;
@@ -17906,7 +17952,7 @@ consider_network_function(struct lflow_table *lflows,
      * match.
      */
     ds_put_format(&match, "inport == %s", input_port->json_key);
-    ds_put_format(&action, "next;");
+    ds_put_format(&action, REG_TUN_OFPORT" = ct_label.tun_if_id; next;");
     ovn_lflow_add(lflows, od, S_SWITCH_IN_NETWORK_FUNCTION, 100,
                   ds_cstr(&match), ds_cstr(&action), lflow_ref);
     ds_clear(&match);
@@ -17948,14 +17994,21 @@ consider_network_function(struct lflow_table *lflows,
     ovn_lflow_add(lflows, od, S_SWITCH_OUT_PRE_ACL, 110, ds_cstr(&match),
                   ds_cstr(&action), lflow_ref);
 
+    /* Priority 120 flows in out_stateful:
+     * If packet was received on a tunnel interface and being forwarded to a
+     * NF port, commit openflow tunnel interface id in ct_label.
+     */
+    build_lswitch_stateful_nf(output_port, &action, &match, lflows, lflow_ref);
+    build_lswitch_stateful_nf(input_port, &action, &match, lflows, lflow_ref);
+
     ds_destroy(&match);
     ds_destroy(&action);
 }
 
 static void
 build_network_function(const struct ovn_datapath *od,
-                       struct lflow_table *lflows,
                        const struct ls_port_group_table *ls_pgs,
+                       struct lflow_table *lflows,
                        struct lflow_ref *lflow_ref)
 {
     unsigned long *nfg_ingress_bitmap
@@ -18016,8 +18069,8 @@ build_network_function(const struct ovn_datapath *od,
                 continue;
             }
             nfg_bitmap = bitmap_set1(nfg_bitmap, nfg_id);
-            consider_network_function(lflows, od, acl->network_function_group,
-                                      lflow_ref, ingress);
+            consider_network_function(od, acl->network_function_group,
+                                      ingress, lflows, lflow_ref);
         }
     }
 
@@ -18041,9 +18094,8 @@ build_network_function(const struct ovn_datapath *od,
                         continue;
                     }
                     nfg_bitmap = bitmap_set1(nfg_bitmap, nfg_id);
-                    consider_network_function(lflows, od,
-                                              acl->network_function_group,
-                                              lflow_ref, ingress);
+                    consider_network_function(od, acl->network_function_group,
+                                              ingress, lflows, lflow_ref);
                 }
             }
         }
@@ -18093,8 +18145,7 @@ build_lswitch_and_lrouter_iterate_by_ls(struct ovn_datapath *od,
     build_mirror_default_lflow(od, lsi->lflows);
     build_lswitch_lflows_pre_acl_and_acl(od, lsi->lflows,
                                          lsi->meter_groups, NULL);
-    build_network_function(od, lsi->lflows, lsi->ls_port_groups,
-                           NULL);
+    build_network_function(od, lsi->ls_port_groups, lsi->lflows, NULL);
     build_fwd_group_lflows(od, lsi->lflows, NULL);
     build_lswitch_lflows_admission_control(od, lsi->lflows, NULL);
     build_lswitch_learn_fdb_od(od, lsi->lflows, NULL);
@@ -19119,9 +19170,8 @@ lflow_handle_ls_stateful_changes(struct ovsdb_idl_txn *ovnsb_txn,
                                 lflow_input->features,
                                 lflows,
                                 lflow_input->sbrec_acl_id_table);
-        build_network_function(od, lflows,
-                               lflow_input->ls_port_groups,
-                               ls_stateful_rec->lflow_ref);
+        build_network_function(od, lflow_input->ls_port_groups,
+                               lflows, ls_stateful_rec->lflow_ref);
 
         /* Sync the new flows to SB. */
         bool handled = lflow_ref_sync_lflows(
diff --git a/ovn-nb.xml b/ovn-nb.xml
index 7f7e70fab..dcb4ac635 100644
--- a/ovn-nb.xml
+++ b/ovn-nb.xml
@@ -1557,6 +1557,22 @@
           this port. The default value is <code>true</code>.
         </column>
 
+        <column name="options" key="is-nf"
+                type='{"type": "boolean"}'>
+          Must be set to <code>true</code> for Network Function ports,
+          that is, the ports used as <code>inport</code> or
+          <code>outport</code> in the Network_Function table.
+          The default value is <code>false</code>.
+        </column>
+
+        <column name="options" key="nf-linked-port"
+                type='{"type": "string"}'>
+          Each row in the Network_Function table refers to two logical switch
+          ports under the columns <code>inport</code> and <code>outport</code>.
+          The port identified as <code>inport</code> must have this option set
+          to the port identified as <code>outport</code>, and vice versa.
+        </column>
+
         <group title="VIF Plugging Options">
           <column name="options" key="vif-plug-type">
             If set, OVN will attempt to perform plugging of this VIF.  In order
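
Illustration only, not part of the patch: a minimal sketch of how the two
options documented above might be set on a pair of NF ports. The helper name
nf_link_cmds is hypothetical; the ovn-nbctl invocations mirror the ones used
in the tests in this patch. The helper only prints the commands, so they can
be reviewed before being run against a Northbound database.

```shell
# Hypothetical helper (an assumption for illustration, not part of the patch):
# print the ovn-nbctl commands that mark two logical switch ports as NF ports
# and link each one to the other through nf-linked-port.
nf_link_cmds() {
    local in_port=$1 out_port=$2
    echo "ovn-nbctl set logical_switch_port $in_port options:is-nf=true options:nf-linked-port=$out_port"
    echo "ovn-nbctl set logical_switch_port $out_port options:is-nf=true options:nf-linked-port=$in_port"
}

# Dry run for the port pair used in the tests; nothing is executed here.
nf_link_cmds sw0-nf-p1 sw0-nf-p2
```

Printing rather than executing keeps the sketch free of side effects; the
symmetric pair of commands reflects the requirement that inport points at
outport and vice versa.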
diff --git a/tests/ovn-controller.at b/tests/ovn-controller.at
index 0b00906ae..b0af455e4 100644
--- a/tests/ovn-controller.at
+++ b/tests/ovn-controller.at
@@ -3717,8 +3717,8 @@ AT_CHECK([grep -c "reg10=0/0x10000" flood_flows], [0], [dnl
 # Geneve
 hv2_cookie="$(chassis_cookie hv2)"
 AT_CHECK_UNQUOTED([grep "cookie=$hv2_cookie," phy_to_log_flows], [0], [dnl
- cookie=$hv2_cookie, priority=120,arp,tun_metadata0=0,in_port="ovn-hv2-0",arp_op=2 actions=load:0x1->NXM_NX_REG10[[16]],move:NXM_NX_TUN_ID[[0..23]]->OXM_OF_METADATA[[0..23]],move:NXM_NX_TUN_METADATA0[[16..30]]->NXM_NX_REG14[[0..14]],move:NXM_NX_TUN_METADATA0[[0..15]]->NXM_NX_REG15[[0..15]],resubmit(,OFTABLE_LOG_INGRESS_PIPELINE)
- cookie=$hv2_cookie, priority=120,icmp6,tun_metadata0=0,in_port="ovn-hv2-0",icmp_type=136,icmp_code=0 actions=load:0x1->NXM_NX_REG10[[16]],move:NXM_NX_TUN_ID[[0..23]]->OXM_OF_METADATA[[0..23]],move:NXM_NX_TUN_METADATA0[[16..30]]->NXM_NX_REG14[[0..14]],move:NXM_NX_TUN_METADATA0[[0..15]]->NXM_NX_REG15[[0..15]],resubmit(,OFTABLE_LOG_INGRESS_PIPELINE)
+ cookie=$hv2_cookie, priority=120,arp,tun_metadata0=0,in_port="ovn-hv2-0",arp_op=2 actions=load:0x1->NXM_NX_REG10[[16]],move:NXM_NX_TUN_ID[[0..23]]->OXM_OF_METADATA[[0..23]],move:NXM_NX_TUN_METADATA0[[16..30]]->NXM_NX_REG14[[0..14]],move:NXM_NX_TUN_METADATA0[[0..15]]->NXM_NX_REG15[[0..15]],load:0x1->NXM_NX_REG5[[16..31]],resubmit(,OFTABLE_LOG_INGRESS_PIPELINE)
+ cookie=$hv2_cookie, priority=120,icmp6,tun_metadata0=0,in_port="ovn-hv2-0",icmp_type=136,icmp_code=0 actions=load:0x1->NXM_NX_REG10[[16]],move:NXM_NX_TUN_ID[[0..23]]->OXM_OF_METADATA[[0..23]],move:NXM_NX_TUN_METADATA0[[16..30]]->NXM_NX_REG14[[0..14]],move:NXM_NX_TUN_METADATA0[[0..15]]->NXM_NX_REG15[[0..15]],load:0x1->NXM_NX_REG5[[16..31]],resubmit(,OFTABLE_LOG_INGRESS_PIPELINE)
 ])
 
 # VXLAN
diff --git a/tests/ovn-nbctl.at b/tests/ovn-nbctl.at
index 0e8d78c98..5636b2969 100644
--- a/tests/ovn-nbctl.at
+++ b/tests/ovn-nbctl.at
@@ -3218,10 +3218,10 @@ AT_CHECK([check ovn-nbctl lsp-add ls0 svc-port0])
 AT_CHECK([check ovn-nbctl lsp-add ls0 svc-port1])
 AT_CHECK([check ovn-nbctl set logical_switch_port svc-port0 \
     options:receive_multicast=false options:lsp_learn_fdb=false \
-    options:network-function=true options:network-function-linked-port=svc-port1])
+    options:is-nf=true options:nf-linked-port=svc-port1])
 AT_CHECK([check ovn-nbctl set logical_switch_port svc-port1 \
     options:receive_multicast=false options:lsp_learn_fdb=false \
-    options:network-function=true options:network-function-linked-port=svc-port0])
+    options:is-nf=true options:nf-linked-port=svc-port0])
 
 # Create network-function.
 AT_CHECK([ovn-nbctl nf-add nf0 svc-port0 svc-port1])
@@ -3238,10 +3238,10 @@ AT_CHECK([check ovn-nbctl lsp-add ls0 svc-port4])
 AT_CHECK([check ovn-nbctl lsp-add ls0 svc-port5])
 AT_CHECK([check ovn-nbctl set logical_switch_port svc-port4 \
     options:receive_multicast=false options:lsp_learn_fdb=false \
-    options:network-function=true options:network-function-linked-port=svc-port5])
+    options:is-nf=true options:nf-linked-port=svc-port5])
 AT_CHECK([check ovn-nbctl set logical_switch_port svc-port5 \
     options:receive_multicast=false options:lsp_learn_fdb=false \
-    options:network-function=true options:network-function-linked-port=svc-port4])
+    options:is-nf=true options:nf-linked-port=svc-port4])
 AT_CHECK([ovn-nbctl --may-exist nf-add nf0 svc-port4 svc-port5])
 AT_CHECK([ovn-nbctl nf-list | uuidfilt], [0], [dnl
 <0> (nf0) in:svc-port4 out:svc-port5
diff --git a/tests/ovn-northd.at b/tests/ovn-northd.at
index 53dc6f409..a3f258173 100644
--- a/tests/ovn-northd.at
+++ b/tests/ovn-northd.at
@@ -18227,8 +18227,8 @@ AT_CHECK(
   [grep -E 'ls_(in|out)_network_function' sw0flows | ovn_strip_lflows | sort], 
[0], [dnl
   table=??(ls_in_network_function), priority=0    , match=(1), action=(next;)
   table=??(ls_in_network_function), priority=1    , match=(reg8[[21]] == 1), 
action=(drop;)
-  table=??(ls_in_network_function), priority=100  , match=(inport == 
"sw0-nf-p1"), action=(next;)
-  table=??(ls_in_network_function), priority=100  , match=(inport == 
"sw0-nf-p2"), action=(next;)
+  table=??(ls_in_network_function), priority=100  , match=(inport == 
"sw0-nf-p1"), action=(reg5[[16..31]] = ct_label.tun_if_id; next;)
+  table=??(ls_in_network_function), priority=100  , match=(inport == 
"sw0-nf-p2"), action=(reg5[[16..31]] = ct_label.tun_if_id; next;)
   table=??(ls_in_network_function), priority=100  , match=(reg8[[21]] == 1 && 
eth.mcast), action=(next;)
   table=??(ls_in_network_function), priority=99   , match=(reg8[[21]] == 1 && 
reg8[[22]] == 1 && reg0[[22..29]] == 1), action=(outport = "sw0-nf-p1"; output;)
   table=??(ls_out_network_function), priority=0    , match=(1), action=(next;)
@@ -18257,6 +18257,18 @@ AT_CHECK(
   table=??(ls_out_acl_eval    ), priority=65532, match=(ct.est && 
ct_mark.allow_established == 1), action=(reg8[[21]] = 
ct_label.network_function_group; reg8[[16]] = 1; next;)
 ])
 
+    AT_CHECK([grep "ls_out_stateful" sw0flows | ovn_strip_lflows], [0], [dnl
+  table=??(ls_out_stateful    ), priority=0    , match=(1), action=(next;)
+  table=??(ls_out_stateful    ), priority=100  , match=(reg0[[1]] == 1 && reg0[[13]] == 0), action=(ct_commit { ct_mark.blocked = 0; ct_mark.allow_established = reg0[[20]]; ct_label.acl_id = reg2[[16..31]]; ct_label.network_function_group = 0; ct_label.network_function_group_id = 0; }; next;)
+  table=??(ls_out_stateful    ), priority=100  , match=(reg0[[1]] == 1 && reg0[[13]] == 1), action=(ct_commit { ct_mark.blocked = 0; ct_mark.allow_established = reg0[[20]]; ct_mark.obs_stage = reg8[[19..20]]; ct_mark.obs_collector_id = reg8[[8..15]]; ct_label.obs_point_id = reg9; ct_label.acl_id = reg2[[16..31]]; ct_label.network_function_group = 0; ct_label.network_function_group_id = 0; }; next;)
+  table=??(ls_out_stateful    ), priority=110  , match=(reg0[[1]] == 1 && reg0[[13]] == 0 && reg8[[21]] == 1), action=(ct_commit { ct_mark.blocked = 0; ct_mark.allow_established = reg0[[20]]; ct_label.acl_id = reg2[[16..31]]; ct_label.network_function_group = 1; ct_label.network_function_group_id = reg0[[22..29]]; }; next;)
+  table=??(ls_out_stateful    ), priority=110  , match=(reg0[[1]] == 1 && reg0[[13]] == 1 && reg8[[21]] == 1), action=(ct_commit { ct_mark.blocked = 0; ct_mark.allow_established = reg0[[20]]; ct_mark.obs_stage = reg8[[19..20]]; ct_mark.obs_collector_id = reg8[[8..15]]; ct_label.obs_point_id = reg9; ct_label.acl_id = reg2[[16..31]]; ct_label.network_function_group = 1; ct_label.network_function_group_id = reg0[[22..29]]; }; next;)
+  table=??(ls_out_stateful    ), priority=120  , match=(outport == "sw0-nf-p1" && reg0[[13]] == 0), action=(ct_commit { ct_mark.blocked = 0; ct_mark.allow_established = reg0[[20]]; ct_label.acl_id = reg2[[16..31]]; ct_label.tun_if_id = reg5[[16..31]]; }; next;)
+  table=??(ls_out_stateful    ), priority=120  , match=(outport == "sw0-nf-p1" && reg0[[13]] == 1), action=(ct_commit { ct_mark.blocked = 0; ct_mark.allow_established = reg0[[20]]; ct_label.acl_id = reg2[[16..31]]; ct_mark.obs_stage = reg8[[19..20]]; ct_mark.obs_collector_id = reg8[[8..15]]; ct_label.obs_point_id = reg9; ct_label.tun_if_id = reg5[[16..31]]; }; next;)
+  table=??(ls_out_stateful    ), priority=120  , match=(outport == "sw0-nf-p2" && reg0[[13]] == 0), action=(ct_commit { ct_mark.blocked = 0; ct_mark.allow_established = reg0[[20]]; ct_label.acl_id = reg2[[16..31]]; ct_label.tun_if_id = reg5[[16..31]]; }; next;)
+  table=??(ls_out_stateful    ), priority=120  , match=(outport == "sw0-nf-p2" && reg0[[13]] == 1), action=(ct_commit { ct_mark.blocked = 0; ct_mark.allow_established = reg0[[20]]; ct_label.acl_id = reg2[[16..31]]; ct_mark.obs_stage = reg8[[19..20]]; ct_mark.obs_collector_id = reg8[[8..15]]; ct_label.obs_point_id = reg9; ct_label.tun_if_id = reg5[[16..31]]; }; next;)
+])
+
 # ICMP packets from sw0-p1 should be redirected to sw0-nf-p1, but in revervse 
direction should not.
 flow_eth_from_p1='eth.src == 00:00:00:00:00:01 && eth.dst == 00:00:00:00:00:02'
 flow_ip_from_p1='ip.ttl==64 && ip4.src == 10.0.0.2 && ip4.dst == 10.0.0.3'
@@ -18309,10 +18321,10 @@ AT_CHECK(
   [grep -E 'ls_(in|out)_network_function' sw0flows | ovn_strip_lflows |  
sort], [0], [dnl
   table=??(ls_in_network_function), priority=0    , match=(1), action=(next;)
   table=??(ls_in_network_function), priority=1    , match=(reg8[[21]] == 1), 
action=(drop;)
-  table=??(ls_in_network_function), priority=100  , match=(inport == 
"sw0-nf-p1"), action=(next;)
-  table=??(ls_in_network_function), priority=100  , match=(inport == 
"sw0-nf-p2"), action=(next;)
-  table=??(ls_in_network_function), priority=100  , match=(inport == 
"sw0-nf-p3"), action=(next;)
-  table=??(ls_in_network_function), priority=100  , match=(inport == 
"sw0-nf-p4"), action=(next;)
+  table=??(ls_in_network_function), priority=100  , match=(inport == 
"sw0-nf-p1"), action=(reg5[[16..31]] = ct_label.tun_if_id; next;)
+  table=??(ls_in_network_function), priority=100  , match=(inport == 
"sw0-nf-p2"), action=(reg5[[16..31]] = ct_label.tun_if_id; next;)
+  table=??(ls_in_network_function), priority=100  , match=(inport == 
"sw0-nf-p3"), action=(reg5[[16..31]] = ct_label.tun_if_id; next;)
+  table=??(ls_in_network_function), priority=100  , match=(inport == 
"sw0-nf-p4"), action=(reg5[[16..31]] = ct_label.tun_if_id; next;)
   table=??(ls_in_network_function), priority=100  , match=(reg8[[21]] == 1 && 
eth.mcast), action=(next;)
   table=??(ls_in_network_function), priority=99   , match=(reg8[[21]] == 1 && 
reg8[[22]] == 0 && ct_label.network_function_group_id == 2), action=(outport = 
"sw0-nf-p3"; output;)
   table=??(ls_in_network_function), priority=99   , match=(reg8[[21]] == 1 && 
reg8[[22]] == 1 && reg0[[22..29]] == 1), action=(outport = "sw0-nf-p1"; output;)
@@ -18380,10 +18392,10 @@ AT_CHECK(
   [grep -E 'ls_(in|out)_network_function' sw1flows | ovn_strip_lflows | sort], 
[0], [dnl
   table=??(ls_in_network_function), priority=0    , match=(1), action=(next;)
   table=??(ls_in_network_function), priority=1    , match=(reg8[[21]] == 1), 
action=(drop;)
-  table=??(ls_in_network_function), priority=100  , match=(inport == 
"sw1-nf-p1"), action=(next;)
-  table=??(ls_in_network_function), priority=100  , match=(inport == 
"sw1-nf-p2"), action=(next;)
-  table=??(ls_in_network_function), priority=100  , match=(inport == 
"sw1-nf-p3"), action=(next;)
-  table=??(ls_in_network_function), priority=100  , match=(inport == 
"sw1-nf-p4"), action=(next;)
+  table=??(ls_in_network_function), priority=100  , match=(inport == 
"sw1-nf-p1"), action=(reg5[[16..31]] = ct_label.tun_if_id; next;)
+  table=??(ls_in_network_function), priority=100  , match=(inport == 
"sw1-nf-p2"), action=(reg5[[16..31]] = ct_label.tun_if_id; next;)
+  table=??(ls_in_network_function), priority=100  , match=(inport == 
"sw1-nf-p3"), action=(reg5[[16..31]] = ct_label.tun_if_id; next;)
+  table=??(ls_in_network_function), priority=100  , match=(inport == 
"sw1-nf-p4"), action=(reg5[[16..31]] = ct_label.tun_if_id; next;)
   table=??(ls_in_network_function), priority=100  , match=(reg8[[21]] == 1 && 
eth.mcast), action=(next;)
   table=??(ls_in_network_function), priority=99   , match=(reg8[[21]] == 1 && 
reg8[[22]] == 0 && ct_label.network_function_group_id == 2), action=(outport = 
"sw1-nf-p3"; output;)
   table=??(ls_in_network_function), priority=99   , match=(reg8[[21]] == 1 && 
reg8[[22]] == 1 && reg0[[22..29]] == 1), action=(outport = "sw1-nf-p1"; output;)
diff --git a/tests/ovn.at b/tests/ovn.at
index 31aa511f8..de69abb1c 100644
--- a/tests/ovn.at
+++ b/tests/ovn.at
@@ -147,6 +147,8 @@ ct_label.network_function_group = ct_label[7]
 ct_label.network_function_group_id = ct_label[17..24]
 ct_label.obs_point_id = ct_label[96..127]
 ct_label.obs_unused = ct_label[0..95]
+ct_label.tun_if = ct_label[8]
+ct_label.tun_if_id = ct_label[80..95]
 ct_mark = NXM_NX_CT_MARK
 ct_mark.allow_established = ct_mark[6]
 ct_mark.blocked = ct_mark[0]
@@ -44010,133 +44012,389 @@ AT_CLEANUP
 ])
 
 OVN_FOR_EACH_NORTHD([
-AT_SETUP([Network function packet flow])
+AT_SETUP([Network function packet flow - outbound])
 AT_KEYWORDS([ovn])
 ovn_start
 
-check ovn-nbctl ls-add sw0
+# Create logical topology. One LS sw0 with 4 ports.
+# From-lport ACL rule directs request packets from sw0-p1 to sw0-p2 via NF port {sw0-nf-p1, sw0-nf-p2}
+# Response packets from sw0-p2 to sw0-p1 redirected via NF ports in reverse order.
+create_logical_topology() {
+    sw=$1
+    check ovn-nbctl ls-add $sw
+    for i in 1 2; do
+        check ovn-nbctl lsp-add $sw $sw-p$i -- lsp-set-addresses $sw-p$i "f0:00:00:00:00:0$i 192.168.0.1$i"
+        check ovn-nbctl lsp-add $sw $sw-nf-p$i -- lsp-set-addresses $sw-nf-p$i "f0:00:00:00:01:0$i"
+    done
+    check ovn-nbctl set logical_switch_port $sw-nf-p1 \
+        options:receive_multicast=false options:lsp_learn_mac=false \
+        options:is-nf=true options:nf-linked-port=$sw-nf-p2
+    check ovn-nbctl set logical_switch_port $sw-nf-p2 \
+        options:receive_multicast=false options:lsp_learn_mac=false \
+        options:is-nf=true options:nf-linked-port=$sw-nf-p1
+    check ovn-nbctl nf-add nf0 $sw-nf-p1 $sw-nf-p2
+    check ovn-nbctl nfg-add nfg0 1 inline nf0
+    check ovn-nbctl pg-add pg0 $sw-p1
+    check ovn-nbctl acl-add pg0 from-lport 1002 "inport == @pg0 && ip4.dst == 192.168.0.12" allow-related nfg0
+}
 
-net_add n1
-sim_add hv1
-as hv1
-ovs-vsctl add-br br-phys
-ovn_attach n1 br-phys 192.168.0.1
+create_logical_topology sw0
+
+# Create three hypervisors
+net_add n
 for i in 1 2 3; do
-    ovs-vsctl add-port br-int vif$i -- \
-        set interface vif$i external-ids:iface-id=sw0-p$i \
-        options:tx_pcap=hv1/vif$i-tx.pcap \
-        options:rxq_pcap=hv1/vif$i-rx.pcap
-    check ovn-nbctl lsp-add sw0 sw0-p$i -- lsp-set-addresses sw0-p$i 
"f0:00:00:00:00:0$i 192.168.0.1$i"
+    sim_add hv$i
+    as hv$i
+    ovs-vsctl add-br br-phys
+    ovs-vsctl set open . external-ids:ovn-bridge-mappings=phys:br-phys
+    ovn_attach n br-phys 192.168.1.$i
 done
 
-for i in 1 2; do
-    ovs-vsctl add-port br-int vif-nf$i -- \
-        set interface vif-nf$i external-ids:iface-id=sw0-nf-p$i \
-        options:tx_pcap=hv1/vif-nf$i-tx.pcap \
-        options:rxq_pcap=hv1/vif-nf$i-rx.pcap
-    check ovn-nbctl lsp-add sw0 sw0-nf-p$i -- lsp-set-addresses sw0-nf-p$i 
"f0:00:00:00:01:0$i"
-done
-check ovn-nbctl set logical_switch_port sw0-nf-p1 
options:receive_multicast=false options:lsp_learn_mac=false 
options:network-function=true options:network-function-linked-port=sw0-nf-p2
-check ovn-nbctl set logical_switch_port sw0-nf-p2 
options:receive_multicast=false options:lsp_learn_mac=false 
options:network-function=true options:network-function-linked-port=sw0-nf-p1
-check ovn-nbctl nf-add nf0 sw0-nf-p1 sw0-nf-p2
-check ovn-nbctl nfg-add nfg0 1 inline nf0
-check ovn-nbctl pg-add pg0 sw0-p1
-check ovn-nbctl acl-add pg0 from-lport 1002 "inport == @pg0 && ip4.dst == 
192.168.0.12" allow-related nfg0
-
-OVN_POPULATE_ARP
-
-wait_for_ports_up
-check ovn-nbctl --wait=hv sync
-
 test_icmp() {
-    local inport=$1 src_mac=$2 dst_mac=$3 src_ip=$4 dst_ip=$5 icmp_type=$6 outport=$7
+    local inport=$1 src_mac=$2 dst_mac=$3 src_ip=$4 dst_ip=$5 icmp_type=$6 outport=$7 in_hv=$8 out_hv=$9
     local packet="inport==\"$inport\" && eth.src==$src_mac &&
                   eth.dst==$dst_mac && ip.ttl==64 && ip4.src==$src_ip
                   && ip4.dst==$dst_ip && icmp4.type==$icmp_type &&
                   icmp4.code==0"
     shift; shift; shift; shift; shift; shift
-    OVS_WAIT_UNTIL([as hv1 ovs-appctl -t ovn-controller inject-pkt "$packet"])
+    OVS_WAIT_UNTIL([as $in_hv ovs-appctl -t ovn-controller inject-pkt "$packet"])
     echo "INJECTED PACKET $packet"
-    echo $packet | ovstest test-ovn expr-to-packets >> $outport.expected
+    echo $packet | ovstest test-ovn expr-to-packets >> $out_hv-$outport.expected
+}
+
+packet_redirection_test() {
+    local hvp1=$1 hvp2=$2 hvnf=$3
+    # Inject ICMP request packet from sw0-p1 and make sure it is being redirected to the nf ingress port.
+    test_icmp sw0-p1 "f0:00:00:00:00:01" "f0:00:00:00:00:02" "192.168.0.11" "192.168.0.12" 8 vif-nf1 $hvp1 $hvnf
+    OVN_CHECK_PACKETS_REMOVE_BROADCAST([$hvnf/vif-nf1-tx.pcap], [$hvnf-vif-nf1.expected])
+
+    # Forward same packet from nf egress port and make sure it is reaching sw0-p2.
+    test_icmp sw0-nf-p2 "f0:00:00:00:00:01" "f0:00:00:00:00:02" "192.168.0.11" "192.168.0.12" 8 vif2 $hvnf $hvp2
+    OVN_CHECK_PACKETS_REMOVE_BROADCAST([$hvp2/vif2-tx.pcap], [$hvp2-vif2.expected])
+
+    # Send response from sw0-p2 and check that it is being redirected to nf egress port.
+    test_icmp sw0-p2 "f0:00:00:00:00:02" "f0:00:00:00:00:01" "192.168.0.12" "192.168.0.11" 0 vif-nf2 $hvp2 $hvnf
+    OVN_CHECK_PACKETS_REMOVE_BROADCAST([$hvnf/vif-nf2-tx.pcap], [$hvnf-vif-nf2.expected])
+
+    # Forward same response packet from nf ingress port and make sure it is reaching sw0-p1.
+    test_icmp sw0-nf-p1 "f0:00:00:00:00:02" "f0:00:00:00:00:01" "192.168.0.12" "192.168.0.11" 0 vif1 $hvnf $hvp1
+    OVN_CHECK_PACKETS_REMOVE_BROADCAST([$hvp1/vif1-tx.pcap], [$hvp1-vif1.expected])
+
+    # Reverse direction packet should flow normally without redirection.
+    # Send ICMP request from sw0-p2 destined to sw0-p1.
+    test_icmp sw0-p2 "f0:00:00:00:00:02" "f0:00:00:00:00:01" "192.168.0.12" "192.168.0.11" 8 vif1 $hvp2 $hvp1
+    OVN_CHECK_PACKETS_REMOVE_BROADCAST([$hvp1/vif1-tx.pcap], [$hvp1-vif1.expected])
+    # Send ICMP response from sw0-p1 destined to sw0-p2.
+    test_icmp sw0-p1 "f0:00:00:00:00:01" "f0:00:00:00:00:02" "192.168.0.11" "192.168.0.12" 0 vif2 $hvp1 $hvp2
+    OVN_CHECK_PACKETS_REMOVE_BROADCAST([$hvp2/vif2-tx.pcap], [$hvp2-vif2.expected])
+}
+
+create_port_binding() {
+    hvp1=$1 hvp2=$2 hvnf=$3
+    as $hvp1
+    ovs-vsctl add-port br-int vif1 -- \
+        set interface vif1 external-ids:iface-id=sw0-p1 \
+        options:tx_pcap=$hvp1/vif1-tx.pcap \
+        options:rxq_pcap=$hvp1/vif1-rx.pcap
+    as $hvp2
+    ovs-vsctl add-port br-int vif2 -- \
+        set interface vif2 external-ids:iface-id=sw0-p2 \
+        options:tx_pcap=$hvp2/vif2-tx.pcap \
+        options:rxq_pcap=$hvp2/vif2-rx.pcap
+    as $hvnf
+    for i in 1 2; do
+        ovs-vsctl add-port br-int vif-nf$i -- \
+            set interface vif-nf$i external-ids:iface-id=sw0-nf-p$i \
+            options:tx_pcap=$hvnf/vif-nf$i-tx.pcap \
+            options:rxq_pcap=$hvnf/vif-nf$i-rx.pcap
+    done
+
+    OVN_POPULATE_ARP
+    wait_for_ports_up
+    check ovn-nbctl --wait=hv sync
+    sleep 1
 }
-# Inject ICMP request packet from sw0-p1 and make sure it is being redirected 
to the nf ingress port.
-test_icmp sw0-p1 "f0:00:00:00:00:01" "f0:00:00:00:00:02" "192.168.0.11" 
"192.168.0.12" 8 vif-nf1
-OVN_CHECK_PACKETS_REMOVE_BROADCAST([hv1/vif-nf1-tx.pcap], [vif-nf1.expected])
 
-# Forward same packet from nf egress port and make sure it is reaching sw0-p2.
-test_icmp sw0-nf-p2 "f0:00:00:00:00:01" "f0:00:00:00:00:02" "192.168.0.11" 
"192.168.0.12" 8 vif2
-OVN_CHECK_PACKETS_REMOVE_BROADCAST([hv1/vif2-tx.pcap], [vif2.expected])
+cleanup_port_binding() {
+    hvp1=$1 hvp2=$2 hvnf=$3
+    as $hvp1
+    ovs-vsctl del-port br-int vif1
+    as $hvp2
+    ovs-vsctl del-port br-int vif2
+    as $hvnf
+    for i in 1 2; do
+        ovs-vsctl del-port br-int vif-nf$i
+    done
+    sleep 1
+}
 
-# Send response from sw0-p2 and check that it is being redirected to nf egress 
port.
-test_icmp sw0-p2 "f0:00:00:00:00:02" "f0:00:00:00:00:01" "192.168.0.12" 
"192.168.0.11" 0 vif-nf2
-OVN_CHECK_PACKETS_REMOVE_BROADCAST([hv1/vif-nf2-tx.pcap], [vif-nf2.expected])
+test_nf_with_multinodes_outbound() {
+    mode=$1
+    # Test 1: Bind all 4 ports to one node
+    echo "$mode: Network function outbound with single node"
+    create_port_binding hv1 hv1 hv1
 
-# Forward same response packet from nf ingress port and make sure it is 
reaching sw0-p1.
-test_icmp sw0-nf-p1 "f0:00:00:00:00:02" "f0:00:00:00:00:01" "192.168.0.12" 
"192.168.0.11" 0 vif1
-OVN_CHECK_PACKETS_REMOVE_BROADCAST([hv1/vif1-tx.pcap], [vif1.expected])
+    packet_redirection_test hv1 hv1 hv1 sw0
 
-# Reverse direction packet should flow normally without redirection.
-# Send ICMP request from sw0-p2 destined to sw0-p1.
-test_icmp sw0-p2 "f0:00:00:00:00:02" "f0:00:00:00:00:01" "192.168.0.12" 
"192.168.0.11" 8 vif1
-OVN_CHECK_PACKETS_REMOVE_BROADCAST([hv1/vif1-tx.pcap], [vif1.expected])
-# Send ICMP response from sw0-p1 destined to sw0-p2.
-test_icmp sw0-p1 "f0:00:00:00:00:01" "f0:00:00:00:00:02" "192.168.0.11" 
"192.168.0.12" 0 vif2
-OVN_CHECK_PACKETS_REMOVE_BROADCAST([hv1/vif2-tx.pcap], [vif2.expected])
+    cleanup_port_binding hv1 hv1 hv1
 
+    # Test 2: src & dst ports on one node, NF on another node
+    # sw0-p1, sw0-p2 on hv1, NF ports on hv2
+    echo "$mode: Network function outbound with two nodes - nf separate"
+    create_port_binding hv1 hv1 hv2
 
-OVN_CLEANUP([hv1])
+    packet_redirection_test hv1 hv1 hv2 sw0
+
+    cleanup_port_binding hv1 hv1 hv2
+
+    # Test 3: src and nf on one node, dst on a second node
+    echo "$mode: Network function outbound with two nodes - nf with src"
+    create_port_binding hv1 hv2 hv1
+
+    packet_redirection_test hv1 hv2 hv1 sw0
+
+    cleanup_port_binding hv1 hv2 hv1
+
+    # Test 4: src on one node, nf & dst on a second node
+    echo "$mode: Network function outbound with two nodes - nf with dst"
+    create_port_binding hv1 hv2 hv2
+
+    packet_redirection_test hv1 hv2 hv2 sw0
+
+    cleanup_port_binding hv1 hv2 hv2
+
+    # Test 5: src on one node, dst on another, NF on a 3rd one
+    # sw0-p1 on hv1, sw0-p2 on hv2, NF ports on hv3
+    echo "$mode: Network function outbound with three nodes"
+    create_port_binding hv1 hv2 hv3
+
+    packet_redirection_test hv1 hv2 hv3 sw0
+
+    cleanup_port_binding hv1 hv2 hv3
+}
+
+test_nf_with_multinodes_outbound overlay
+
+# Tests for VLAN network
+# Add localnet port to make it a VLAN backed LS
+check ovn-nbctl lsp-add sw0 ln0 "" 100
+check ovn-nbctl lsp-set-addresses ln0 unknown
+check ovn-nbctl lsp-set-type ln0 localnet
+check ovn-nbctl lsp-set-options ln0 network_name=phys
+
+test_nf_with_multinodes_outbound VLAN
+
+# Cleanup logical topology
+check ovn-nbctl lsp-del ln0
+check ovn-nbctl acl-del pg0 from-lport 1002 "inport == @pg0 && ip4.dst == 192.168.0.12"
+check ovn-nbctl pg-del pg0
+check ovn-nbctl nfg-del nfg0
+check ovn-nbctl nf-del nf0
+check ovn-nbctl clear logical_switch_port sw0-nf-p1 options
+check ovn-nbctl clear logical_switch_port sw0-nf-p2 options
+for i in 1 2; do
+    check ovn-nbctl lsp-del sw0-p$i
+    check ovn-nbctl lsp-del sw0-nf-p$i
+done
+check ovn-nbctl ls-del sw0
+check ovn-nbctl --wait=hv sync
+
+OVN_CLEANUP([hv1],[hv2],[hv3])
 AT_CLEANUP
 ])
 
 OVN_FOR_EACH_NORTHD([
-AT_SETUP([Network function TCP packet flow])
+AT_SETUP([Network function packet flow - inbound])
 AT_KEYWORDS([ovn])
 ovn_start
 
-check ovn-nbctl ls-add sw0
+# Create logical topology. One LS sw0 with 4 ports.
+# to-lport ACL rule directs request packets from sw0-p2 to sw0-p1 via NF port {sw0-nf-p1, sw0-nf-p2}
+# Response packets from sw0-p1 to sw0-p2 redirected via NF ports in reverse order.
+create_logical_topology() {
+    sw=$1
+    check ovn-nbctl ls-add $sw
+    for i in 1 2; do
+        check ovn-nbctl lsp-add $sw $sw-p$i -- lsp-set-addresses $sw-p$i "f0:00:00:00:00:0$i 192.168.0.1$i"
+        check ovn-nbctl lsp-add $sw $sw-nf-p$i -- lsp-set-addresses $sw-nf-p$i "f0:00:00:00:01:0$i"
+    done
+    check ovn-nbctl set logical_switch_port $sw-nf-p1 \
+        options:receive_multicast=false options:lsp_learn_mac=false \
+        options:is-nf=true options:nf-linked-port=$sw-nf-p2
+    check ovn-nbctl set logical_switch_port $sw-nf-p2 \
+        options:receive_multicast=false options:lsp_learn_mac=false \
+        options:is-nf=true options:nf-linked-port=$sw-nf-p1
+    check ovn-nbctl nf-add nf0 $sw-nf-p1 $sw-nf-p2
+    check ovn-nbctl nfg-add nfg0 1 inline nf0
+    check ovn-nbctl pg-add pg0 $sw-p1
+    check ovn-nbctl acl-add pg0 to-lport 1002 "outport == @pg0 && ip4.src == 192.168.0.12" allow-related nfg0
+}
 
-net_add n1
-sim_add hv1
-as hv1
-ovs-vsctl add-br br-phys
-ovn_attach n1 br-phys 192.168.0.1
+create_logical_topology sw0
+
+# Create three hypervisors
+net_add n
 for i in 1 2 3; do
-    ovs-vsctl add-port br-int vif$i -- \
-        set interface vif$i external-ids:iface-id=sw0-p$i \
-        options:tx_pcap=hv1/vif$i-tx.pcap \
-        options:rxq_pcap=hv1/vif$i-rx.pcap
-    check ovn-nbctl lsp-add sw0 sw0-p$i -- lsp-set-addresses sw0-p$i 
"f0:00:00:00:00:0$i 192.168.0.1$i"
+    sim_add hv$i
+    as hv$i
+    ovs-vsctl add-br br-phys
+    ovs-vsctl set open . external-ids:ovn-bridge-mappings=phys:br-phys
+    ovn_attach n br-phys 192.168.1.$i
 done
 
-check ovn-nbctl pg-add pg0 sw0-p1
-check ovn-nbctl acl-add pg0 from-lport 1002 "inport == @pg0 && ip4.dst == 
192.168.0.12" allow-related
-check ovn-nbctl acl-add pg0 to-lport 1002 "outport == @pg0 && ip4.src == 
192.168.0.12" drop
+test_icmp() {
+    local inport=$1 src_mac=$2 dst_mac=$3 src_ip=$4 dst_ip=$5 icmp_type=$6 outport=$7 in_hv=$8 out_hv=$9
+    local packet="inport==\"$inport\" && eth.src==$src_mac &&
+                  eth.dst==$dst_mac && ip.ttl==64 && ip4.src==$src_ip
+                  && ip4.dst==$dst_ip && icmp4.type==$icmp_type &&
+                  icmp4.code==0"
+    shift; shift; shift; shift; shift; shift
+    OVS_WAIT_UNTIL([as $in_hv ovs-appctl -t ovn-controller inject-pkt "$packet"])
+    echo "INJECTED PACKET $packet"
+    echo $packet | ovstest test-ovn expr-to-packets >> $out_hv-$outport.expected
+}
+
+packet_redirection_test() {
+    local hvp1=$1 hvp2=$2 hvnf=$3
+    # Inject ICMP request packet from sw0-p2 and make sure it is being redirected to the nf outport.
+    test_icmp sw0-p2 "f0:00:00:00:00:02" "f0:00:00:00:00:01" "192.168.0.12" "192.168.0.11" 8 vif-nf2 $hvp2 $hvnf
+    OVN_CHECK_PACKETS_REMOVE_BROADCAST([$hvnf/vif-nf2-tx.pcap], [$hvnf-vif-nf2.expected])
+
+    # Forward same packet from nf inport and make sure it is reaching sw0-p1.
+    test_icmp sw0-nf-p1 "f0:00:00:00:00:02" "f0:00:00:00:00:01" "192.168.0.12" "192.168.0.11" 8 vif1 $hvnf $hvp1
+    OVN_CHECK_PACKETS_REMOVE_BROADCAST([$hvp1/vif1-tx.pcap], [$hvp1-vif1.expected])
+
+    # Send response from sw0-p1 and check that it is being redirected to nf inport.
+    test_icmp sw0-p1 "f0:00:00:00:00:01" "f0:00:00:00:00:02" "192.168.0.11" "192.168.0.12" 0 vif-nf1 $hvp1 $hvnf
+    OVN_CHECK_PACKETS_REMOVE_BROADCAST([$hvnf/vif-nf1-tx.pcap], [$hvnf-vif-nf1.expected])
+
+    # Forward same response packet from nf outport and make sure it is reaching sw0-p2.
+    test_icmp sw0-nf-p2 "f0:00:00:00:00:01" "f0:00:00:00:00:02" "192.168.0.11" "192.168.0.12" 0 vif2 $hvnf $hvp2
+    OVN_CHECK_PACKETS_REMOVE_BROADCAST([$hvp2/vif2-tx.pcap], [$hvp2-vif2.expected])
+
+    # Reverse direction packet should flow normally without redirection.
+    # Send ICMP request from sw0-p1 destined to sw0-p2.
+    test_icmp sw0-p1 "f0:00:00:00:00:01" "f0:00:00:00:00:02" "192.168.0.11" "192.168.0.12" 8 vif2 $hvp1 $hvp2
+    OVN_CHECK_PACKETS_REMOVE_BROADCAST([$hvp2/vif2-tx.pcap], [$hvp2-vif2.expected])
+    # Send ICMP response from sw0-p2 destined to sw0-p1.
+    test_icmp sw0-p2 "f0:00:00:00:00:02" "f0:00:00:00:00:01" "192.168.0.12" "192.168.0.11" 0 vif1 $hvp2 $hvp1
+    OVN_CHECK_PACKETS_REMOVE_BROADCAST([$hvp1/vif1-tx.pcap], [$hvp1-vif1.expected])
+}
+
+create_port_binding() {
+    hvp1=$1 hvp2=$2 hvnf=$3
+    as $hvp1
+    ovs-vsctl add-port br-int vif1 -- \
+        set interface vif1 external-ids:iface-id=sw0-p1 \
+        options:tx_pcap=$hvp1/vif1-tx.pcap \
+        options:rxq_pcap=$hvp1/vif1-rx.pcap
+    as $hvp2
+    ovs-vsctl add-port br-int vif2 -- \
+        set interface vif2 external-ids:iface-id=sw0-p2 \
+        options:tx_pcap=$hvp2/vif2-tx.pcap \
+        options:rxq_pcap=$hvp2/vif2-rx.pcap
+    as $hvnf
+    for i in 1 2; do
+        ovs-vsctl add-port br-int vif-nf$i -- \
+            set interface vif-nf$i external-ids:iface-id=sw0-nf-p$i \
+            options:tx_pcap=$hvnf/vif-nf$i-tx.pcap \
+            options:rxq_pcap=$hvnf/vif-nf$i-rx.pcap
+    done
 
-OVN_POPULATE_ARP
+    OVN_POPULATE_ARP
+    wait_for_ports_up
+    check ovn-nbctl --wait=hv sync
+    sleep 1
+}
 
-wait_for_ports_up
-check ovn-nbctl --wait=hv sync
+cleanup_port_binding() {
+    hvp1=$1 hvp2=$2 hvnf=$3
+    as $hvp1
+    ovs-vsctl del-port br-int vif1
+    as $hvp2
+    ovs-vsctl del-port br-int vif2
+    as $hvnf
+    for i in 1 2; do
+        ovs-vsctl del-port br-int vif-nf$i
+    done
+    check ovn-nbctl --wait=hv sync
+    sleep 1
+}
 
-test_tcp() {
-    local inport=$1 outport=$2 src_mac=$3 dst_mac=$4 src_ip=$5 dst_ip=$6 src_port=$7 dst_port=$8 flags=$9
-    local packet="inport==\"$inport\" && eth.src==$src_mac &&
-                  eth.dst==$dst_mac && ip.ttl==64 && ip4.src==$src_ip
-                  && ip4.dst==$dst_ip && tcp && tcp.src==$src_port && tcp.dst==$dst_port
-                  && tcp.dst==$dst_port && tcp.flags==$flags"
-    shift; shift; shift; shift; shift; shift; shift; shift; shift
-    OVS_WAIT_UNTIL([as hv1 ovs-appctl -t ovn-controller inject-pkt "$packet"])
-    echo "INJECTED TCP PACKET $packet"
-    echo $packet | ovstest test-ovn expr-to-packets >> $outport.expected
-}
-# Send TCP SYN from p1 to p2.
-echo "Sending TCP SYN from p1 to p2"
-test_tcp sw0-p1 vif2 "f0:00:00:00:00:01" "f0:00:00:00:00:02" "192.168.0.11" "192.168.0.12" 1000 80 2
-OVN_CHECK_PACKETS_REMOVE_BROADCAST([hv1/vif2-tx.pcap], [vif2.expected])
-# Send TCP SYN-ACK from p2 to p1.
-echo "Sending TCP SYN-ACK from p2 to p1"
-test_tcp sw0-p2 vif1 "f0:00:00:00:00:02" "f0:00:00:00:00:01" "192.168.0.12" "192.168.0.11" 80 1000 18
-OVN_CHECK_PACKETS_REMOVE_BROADCAST([hv1/vif1-tx.pcap], [vif1.expected])
+test_nf_with_multinodes_inbound() {
+    mode=$1
 
-OVN_CLEANUP([hv1])
+    # Test 1: Bind all 4 ports to one node
+    echo "$mode: Network function inbound with single node"
+    create_port_binding hv1 hv1 hv1
+
+    packet_redirection_test hv1 hv1 hv1 sw0
+
+    cleanup_port_binding hv1 hv1 hv1
+
+    # Test 2: src & dst ports on one node, NF on another node
+    # sw0-p1, sw0-p2 on hv1, NF ports on hv2
+    echo "$mode: Network function inbound with two nodes - nf separate"
+    create_port_binding hv1 hv1 hv2
+
+    packet_redirection_test hv1 hv1 hv2 sw0
+
+    cleanup_port_binding hv1 hv1 hv2
+
+    # Test 3: dst and nf on one node, src on a second node
+    # sw0-p1 & NF ports on hv1, sw0-p2 on hv2
+    echo "$mode: Network function inbound with two nodes - nf with dst"
+    create_port_binding hv1 hv2 hv1
+
+    packet_redirection_test hv1 hv2 hv1 sw0
+
+    cleanup_port_binding hv1 hv2 hv1
+
+    # Test 4: dst on one node, nf & src on a second node
+    # sw0-p1 on hv1, sw0-p2 & NF ports on hv2
+    echo "$mode: Network function inbound with two nodes - nf with src"
+    create_port_binding hv1 hv2 hv2
+
+    packet_redirection_test hv1 hv2 hv2 sw0
+
+    cleanup_port_binding hv1 hv2 hv2
+
+    # Test 5: src on one node, dst on another, NF on a 3rd one
+    # sw0-p1 on hv1, sw0-p2 on hv2, NF ports on hv3
+    echo "$mode: Network function inbound with three nodes"
+    create_port_binding hv1 hv2 hv3
+
+    packet_redirection_test hv1 hv2 hv3 sw0
+
+    cleanup_port_binding hv1 hv2 hv3
+}
+
+test_nf_with_multinodes_inbound overlay
+
+# Tests for VLAN network
+# Add localnet port to make it a VLAN backed LS
+check ovn-nbctl lsp-add sw0 ln0 "" 100
+check ovn-nbctl lsp-set-addresses ln0 unknown
+check ovn-nbctl lsp-set-type ln0 localnet
+check ovn-nbctl lsp-set-options ln0 network_name=phys
+
+test_nf_with_multinodes_inbound VLAN
+
+# Cleanup logical topology
+check ovn-nbctl lsp-del ln0
+check ovn-nbctl acl-del pg0 to-lport 1002 "outport == @pg0 && ip4.src == 192.168.0.12"
+check ovn-nbctl pg-del pg0
+check ovn-nbctl nfg-del nfg0
+check ovn-nbctl nf-del nf0
+check ovn-nbctl clear logical_switch_port sw0-nf-p1 options
+check ovn-nbctl clear logical_switch_port sw0-nf-p2 options
+for i in 1 2; do
+    check ovn-nbctl lsp-del sw0-p$i
+    check ovn-nbctl lsp-del sw0-nf-p$i
+done
+check ovn-nbctl ls-del sw0
+check ovn-nbctl --wait=hv sync
+
+OVN_CLEANUP([hv1],[hv2],[hv3])
 AT_CLEANUP
 ])
diff --git a/tests/system-ovn.at b/tests/system-ovn.at
index 8e356df6f..6aef8ec80 100644
--- a/tests/system-ovn.at
+++ b/tests/system-ovn.at
@@ -18484,3 +18484,167 @@ OVS_TRAFFIC_VSWITCHD_STOP(["/failed to query port patch-.*/d
 /connection dropped.*/d"])
 AT_CLEANUP
 ])
+
+OVN_FOR_EACH_NORTHD([
+AT_SETUP([Network Function])
+AT_SKIP_IF([test $HAVE_TCPDUMP = no])
+ovn_start
+OVS_TRAFFIC_VSWITCHD_START()
+
+ADD_BR([br-int])
+
+# Set external-ids in br-int needed for ovn-controller
+ovs-vsctl \
+        -- set Open_vSwitch . external-ids:system-id=hv1 \
+        -- set Open_vSwitch . external-ids:ovn-remote=unix:$ovs_base/ovn-sb/ovn-sb.sock \
+        -- set Open_vSwitch . external-ids:ovn-encap-type=geneve \
+        -- set Open_vSwitch . external-ids:ovn-encap-ip=169.0.0.1 \
+        -- set bridge br-int fail-mode=secure other-config:disable-in-band=true
+
+# Start ovn-controller
+start_daemon ovn-controller
+
+ADD_NAMESPACES(client)
+ADD_VETH(client, client, br-int, "192.168.1.10/24", "f0:00:00:01:02:10")
+ADD_NAMESPACES(server)
+ADD_VETH(server, server, br-int, "192.168.1.20/24", "f0:00:00:01:02:20")
+ADD_NAMESPACES(nf)
+ADD_VETH(nf-p1, nf, br-int, "0", "f0:00:00:01:02:30")
+ADD_VETH(nf-p2, nf, br-int, "0", "f0:00:00:01:02:40")
+ADD_VETH(nf-p3, nf, br-int, "0", "f0:00:00:01:02:50")
+ADD_VETH(nf-p4, nf, br-int, "0", "f0:00:00:01:02:60")
+
+check ovn-nbctl ls-add sw0
+check ovn-nbctl lsp-add sw0 client \
+    -- lsp-set-addresses client "f0:00:00:01:02:10 192.168.1.10/24"
+check ovn-nbctl lsp-add sw0 server \
+    -- lsp-set-addresses server "f0:00:00:01:02:20 192.168.1.20/24"
+check ovn-nbctl ls-add nf
+check ovn-nbctl lsp-add nf nf-p1
+check ovn-nbctl lsp-add nf nf-p2
+check ovn-nbctl lsp-add nf nf-p3
+check ovn-nbctl lsp-add nf nf-p4
+check ovn-nbctl set logical_switch_port nf-p1 options:receive_multicast=false options:lsp_learn_fdb=false \
+                                              options:is-nf=true options:nf-linked-port=nf-p2
+check ovn-nbctl set logical_switch_port nf-p2 options:receive_multicast=false options:lsp_learn_fdb=false \
+                                              options:is-nf=true options:nf-linked-port=nf-p1
+check ovn-nbctl set logical_switch_port nf-p3 options:receive_multicast=false options:lsp_learn_fdb=false \
+                                              options:is-nf=true options:nf-linked-port=nf-p4
+check ovn-nbctl set logical_switch_port nf-p4 options:receive_multicast=false options:lsp_learn_fdb=false \
+                                              options:is-nf=true options:nf-linked-port=nf-p3
+check ovn-nbctl lsp-add sw0 child-1 nf-p1 100
+check ovn-nbctl lsp-add sw0 child-2 nf-p2 100
+check ovn-nbctl lsp-add sw0 child-3 nf-p3 100
+check ovn-nbctl lsp-add sw0 child-4 nf-p4 100
+check ovn-nbctl set logical_switch_port child-1 options:receive_multicast=false options:lsp_learn_fdb=false \
+                                                options:is-nf=true options:nf-linked-port=child-2
+check ovn-nbctl set logical_switch_port child-2 options:receive_multicast=false options:lsp_learn_fdb=false \
+                                                options:is-nf=true options:nf-linked-port=child-1
+check ovn-nbctl set logical_switch_port child-3 options:receive_multicast=false options:lsp_learn_fdb=false \
+                                                options:is-nf=true options:nf-linked-port=child-4
+check ovn-nbctl set logical_switch_port child-4 options:receive_multicast=false options:lsp_learn_fdb=false \
+                                                options:is-nf=true options:nf-linked-port=child-3
+
+AS_BOX([Test-1: Single NF without health check])
+
+check ovn-nbctl nf-add nf0 nf-p1 nf-p2
+nf0_uuid=$(fetch_column nb:network_function _uuid name=nf0)
+check ovn-nbctl nfg-add nfg0 1 inline nf0
+
+check ovn-nbctl pg-add pg0 server
+check ovn-nbctl acl-add pg0 from-lport 1001 "inport == @pg0 && ip4.dst == 192.168.1.10" allow-related nfg0
+check ovn-nbctl acl-add pg0 to-lport 1002 "outport == @pg0 && ip4.src == 192.168.1.10" allow-related nfg0
+
+check ovn-nbctl --wait=hv sync
+
+# configure bridge inside nf namespace for nf0 to simulate NF behaviour
+NS_CHECK_EXEC([nf], [ip link add name br0 type bridge])
+NS_CHECK_EXEC([nf], [ip link set dev nf-p1 master br0])
+NS_CHECK_EXEC([nf], [ip link set dev nf-p2 master br0])
+NS_CHECK_EXEC([nf], [ip link set dev br0 up])
+
+validate_traffic() {
+    send_data=$1; recv_data=$2; pkt_cnt=$3;
+    AT_CHECK([printf "$send_data\n" > /tmp/nffifo], [0], [dnl
+])
+
+    if [[ -n "$recv_data" ]]; then
+        OVS_WAIT_FOR_OUTPUT_UNQUOTED([cat output.txt], [0], [dnl
+$recv_data
+])
+    else
+        OVS_WAIT_FOR_OUTPUT([cat output.txt], [0], [dnl
+])
+    fi
+
+    : > output.txt
+
+    OVS_WAIT_UNTIL([
+        total_pkts=$(cat pkt.pcap | wc -l)
+        test ${total_pkts} -ge ${pkt_cnt}
+    ])
+}
+
+validate_single_nf_no_health_check() {
+    client_ns=$1; server_ns=$2; sip=$3; direction=$4
+
+    # Start a TCP server
+    NETNS_DAEMONIZE($server_ns, [server.py -i $sip -p 10000], [server.pid])
+    on_exit 'kill $(cat server.pid)'
+
+    # Ensure TCP server is ready for connections
+    OVS_WAIT_FOR_OUTPUT([cat output.txt], [0], [dnl
+Server Ready
+])
+    : > output.txt
+
+    # Make a FIFO and send its output to a server
+    mkfifo /tmp/nffifo
+    on_exit 'rm -rf /tmp/nffifo'
+
+    NETNS_DAEMONIZE($client_ns, [client.py -f "/tmp/nffifo" -i $sip -p 10000], [client.pid])
+    on_exit 'kill $(cat client.pid)'
+
+    AS_BOX([$direction: Verify traffic forwarding through single NF without health check])
+
+    # Capture traffic on nf0
+    NS_CHECK_EXEC([nf], [tcpdump -l -nvv -i nf-p1 tcp > pkt.pcap 2>tcpdump_err &])
+    OVS_WAIT_UNTIL([grep "listening" tcpdump_err])
+    on_exit 'kill $(pidof tcpdump)'
+
+    # Verify no service monitors exist when health check is not configured
+    #AT_CHECK([ovn-sbctl list service_monitor | grep -v "^$"], [1])
+    AT_CHECK([ovn-sbctl list service_monitor | wc -l], [0], [dnl
+0
+])
+
+    validate_traffic "test" "test" 5
+
+    kill $(pidof tcpdump)
+    kill $(cat client.pid)
+    kill $(cat server.pid)
+    rm -f client.pid server.pid /tmp/nffifo
+}
+
+AS_BOX([Verify inbound traffic forwarding through NF without health check])
+validate_single_nf_no_health_check "client" "server" "192.168.1.20" "Inbound"
+AS_BOX([Verify outbound traffic forwarding through NF without health check])
+validate_single_nf_no_health_check "server" "client" "192.168.1.10" "Outbound"
+
+OVN_CLEANUP_CONTROLLER([hv1])
+
+as ovn-sb
+OVS_APP_EXIT_AND_WAIT([ovsdb-server])
+
+as ovn-nb
+OVS_APP_EXIT_AND_WAIT([ovsdb-server])
+
+as northd
+OVS_APP_EXIT_AND_WAIT([ovn-northd])
+
+as
+OVS_TRAFFIC_VSWITCHD_STOP(["/.*error receiving.*/d
+/failed to query port patch-.*/d
+/.*terminating with signal 15.*/d"])
+AT_CLEANUP
+])
-- 
2.39.3
