This series fixes a non-optimal behavior with some multichassis ports.

Specifically,

- when a multichassis port belongs to a switch that also has a localnet
  port,
- because ingress and egress traffic for the port is funnelled through
  tunnels to guarantee delivery of packets to all chassis that host the
  port,
- because tunnels add overhead to each funnelled packet,
- leaving less space for actual packets,

... the effective MTU of the path for multichassis ports is reduced by
the size that takes the tunnel to wrap a packet around. (This depends on
the type of tunnel, also on IP version of the tunnel.)

When this happens, the port and its peers are not notified about the
reduced MTU for the path to the port, effectively blackholing some of
the larger packets.

This patch series makes OVN detect these scenarios and handle them as
follows:

- for all multichassis ports, 4 flows are added - for inport and
  outport matches - to detect "too-big" IP packets;
- for all "too-big" packets, 2 flows are added to produce a ICMP
  Fragmentation Needed (for IPv4) or Too Big (for IPv6) packet and
  return it to the offending sender.

The proposed implementation is isolated in ovn-controller. An
alternative would be to implement flows producing ICMP errors via the
existing Logical_Flow action icmp4_error, as is done for router ports.
In this case, the 4 flows that detect "too-big" packets would still have
to reside in ovn-controller because of limitations of the current OVN
port model, namely lack of `mtu` as an attribute of OVN logical port.

Caveats: this works for IP traffic only. The party receiving the ICMP
error should handle it by temporarily lowering the MTU used to reach the
port. Behavior may depend on protocol implementation of the offending
peer as well as its networking stack configuration.

This patch series was successfully tested with Linux kernel 6.0.8.

This patch series is designed to simplify backporting process. It is
split in logical pieces to simplify cherry-picking. I consider the
original behavior described at the start of the cover letter a bug and
worth a backport.

---
v1: - initial version.
v2: - more system test cases adjusted to new table numbers for egress
      pipeline;
    - switch from xxd to coreutils (tr, head) tools to avoid a new
      dependency;
    - adjusted a comment in the new test case to avoid "migrator"
      terminology that makes no sense outside live migration context -
      which is not the only potential use case for multichassis ports;
    - guard the new test case with HAVE_SCAPY;
    - nit: rewrapped some lines in patch 6/7 to avoid unnecessary line
      changes;
    - added a NEWS entry for patch 6/7 that implements the core of the
      feature;
    - rebased to latest main.
v3: - expose and reuse lib/actions.c:encode_[start|finish]_controller_op
      functions;
    - reuse ETH_HEADER_LEN, IP_HEADER_LEN, and IPV6_HEADER_LEN from
      ovs:lib/packets.h;
    - add a comment for REGBIT_PKT_LARGER that mentions that the
      register is also used in OFTABLE_OUTPUT_LARGE_PKT_DETECT table;
    - squash the patch that makes if-status-mgr track changes to
      interface mtu into the patch that introduces the mtu field in the
      manager;
    - doc: don't describe path discovery flows in the patch that
      introduces new tables (only in the patch that actually implements
      the flows);
    - nit: use MAX to shave off several lines of code;
    - nit: remove redundant if_status_mgr_iface_get_mtu declaration from
      a header file;
    - rebased to latest main;
    - updated acks based on latest review comments.
v4: - fix compilation issue in the middle of the series due to previous
      commit rearrangement.
v5: - introduce a new if-status-mgr input to pflow_output, then process
      OVS interface updates inside if-status-mgr handlers.
    - if-status-mgr: remove set_mtu public function, instead other
      modules should call to if_status_mgr_iface_update. This allows the
      callers to not care which particular fields if-status-mgr would
      like to persist from the OVS record, keeping concerns separated.
---

Ihar Hrachyshka (5):
  Track ip version of tunnel in chassis_tunnel struct
  Track interface MTU in if-status-mgr
  if-status: track interfaces for additional chassis
  Add new egress tables to accommodate for too-big packets handling
  Implement MTU Path Discovery for multichassis ports

 NEWS                        |   6 +
 controller/binding.c        |  48 +--
 controller/binding.h        |   4 +
 controller/if-status.c      |  48 ++-
 controller/if-status.h      |   8 +-
 controller/lflow.c          |   4 +-
 controller/lflow.h          |  49 +--
 controller/local_data.c     |   2 +
 controller/local_data.h     |   1 +
 controller/ovn-controller.c | 122 +++++++
 controller/ovsport.c        |   9 +
 controller/ovsport.h        |   2 +
 controller/physical.c       | 337 ++++++++++++++++++--
 controller/physical.h       |   2 +
 controller/pinctrl.c        |   8 +-
 include/ovn/actions.h       |   3 +
 lib/actions.c               |   4 +-
 lib/ovn-util.h              |   7 +
 northd/northd.c             |   2 +
 ovn-architecture.7.xml      |  76 ++---
 ovs                         |   2 +-
 tests/ovn-controller.at     | 174 +++++-----
 tests/ovn.at                | 611 +++++++++++++++++++++++++++---------
 tests/system-ovn-kmod.at    |   2 +-
 tests/system-ovn.at         |   8 +-
 25 files changed, 1180 insertions(+), 359 deletions(-)

-- 
2.38.1

_______________________________________________
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Reply via email to