On 3/22/23 08:34, Nobuhiro MIKI wrote:
> On 2023/03/22 8:15, Ilya Maximets wrote:
>> On 3/15/23 07:07, Nobuhiro MIKI wrote:
>>> SRv6 (Segment Routing IPv6) tunnel vport is responsible
>>> for encapsulation and decapsulation the inner packets with
>>> IPv6 header and an extended header called SRH
>>> (Segment Routing Header). See spec in:
>>>
>>> https://datatracker.ietf.org/doc/html/rfc8754
>>>
>>> This patch implements SRv6 tunneling in userspace datapath.
>>> It uses `remote_ip` and `local_ip` options as with existing
>>> tunnel protocols. It also adds a dedicated `srv6_segs` option
>>> to define a sequence of routers called segment list.
>>>
>>> Signed-off-by: Nobuhiro MIKI <nm...@yahoo-corp.jp>
>>> ---
>>>  Documentation/faq/configuration.rst |  21 ++++
>>>  Documentation/faq/releases.rst      |   1 +
>>>  NEWS                                |   2 +
>>>  include/linux/openvswitch.h         |   1 +
>>>  lib/dpif-netlink-rtnl.c             |   5 +
>>>  lib/dpif-netlink.c                  |   5 +
>>>  lib/netdev-native-tnl.c             | 145 ++++++++++++++++++++++++++++
>>>  lib/netdev-native-tnl.h             |  10 ++
>>>  lib/netdev-vport.c                  |  53 ++++++++++
>>>  lib/netdev.h                        |   4 +
>>>  lib/packets.h                       |  11 +++
>>>  lib/tnl-ports.c                     |   6 +-
>>>  ofproto/ofproto-dpif-xlate.c        |   3 +
>>>  tests/system-kmod-macros.at         |   8 ++
>>>  tests/system-traffic.at             | 119 +++++++++++++++++++++++
>>>  tests/system-userspace-macros.at    |   6 ++
>>>  tests/tunnel.at                     |  56 +++++++++++
>>>  17 files changed, 455 insertions(+), 1 deletion(-)
>>
>> Thanks for the new version!  The code looks good in general.
>> See some small comments inline.
>>
> 
> Thanks for your review!
> I have written reply inline.
> 
> Best Regards,
> Nobuhiro Miki
> 
>>> +static void
>>> +srv6_build_header(struct ovs_action_push_tnl *data,
>>> +                  const struct netdev_tnl_build_header_params *params,
>>> +                  int nr_segs, const struct in6_addr *segs)
>>> +{
>>> +    struct ovs_16aligned_ip6_hdr *nh6;
>>> +    struct srv6_base_hdr *srh;
>>> +    struct in6_addr *s;
>>> +    unsigned int hlen;
>>> +    ovs_be16 dl_type;
>>> +    int i;
>>> +
>>> +    ovs_assert(nr_segs > 0);
>>> +
>>> +    nh6 = (struct ovs_16aligned_ip6_hdr *) eth_build_header(data, params);
>>> +    put_16aligned_be32(&nh6->ip6_flow, htonl(6 << 28) |
>>> +                       htonl(params->flow->tunnel.ip_tos << 20));
>>> +    nh6->ip6_hlim = params->flow->tunnel.ip_ttl;
>>> +    nh6->ip6_nxt = IPPROTO_ROUTING;
>>> +    memcpy(&nh6->ip6_src, params->s_ip, sizeof(ovs_be32[4]));
>>> +    memcpy(&nh6->ip6_dst, &segs[0], sizeof(ovs_be32[4]));
>>
>> We should probably use netdev_tnl_ip_build_header() here.
>> See below...
>>
> 
> Indeed it is. I'll fix it.
> 
>>> +
>>> +    srh = (struct srv6_base_hdr *) (nh6 + 1);
>>> +    dl_type = params->flow->dl_type;
>>> +    if (dl_type == htons(ETH_TYPE_IP)) {
>>> +        srh->rt_hdr.nexthdr = IPPROTO_IPIP;
>>> +    } else if (dl_type == htons(ETH_TYPE_IPV6)) {
>>> +        srh->rt_hdr.nexthdr = IPPROTO_IPV6;
>>> +    }
>>> +    srh->rt_hdr.type = IPV6_SRCRT_TYPE_4;
>>> +    srh->rt_hdr.hdrlen = 2 * nr_segs;
>>> +    srh->rt_hdr.segments_left = nr_segs - 1;
>>> +    srh->last_entry = nr_segs - 1;
>>> +    srh->flags = 0;
>>> +    srh->tag = 0;
>>> +
>>> +    s = ALIGNED_CAST(struct in6_addr *,
>>> +                     (char *) srh + sizeof(struct srv6_base_hdr));
>>> +    for (i = 0; i < nr_segs; i++) {
>>> +        /* Segment list is written to the header in reverse order. */
>>> +        memcpy(s, &segs[nr_segs - i - 1], sizeof(ovs_be32[4]));
>>
>> It should be sizeof *s instead.  We should avoid using type sizes
>> as per the coding style. (I know netdev_tnl_ip_build_header is not
>> following that, unfortunately.)
>>
> 
> OK. I'll look at similar code together.
> 
>>> +        s++;
>>> +    }
>>> +
>>> +    hlen = IPV6_HEADER_LEN + sizeof(struct srv6_base_hdr) +
>>> +           8 * srh->rt_hdr.hdrlen;
>>> +
>>> +    data->header_len += hlen;
>>> +    data->tnl_type = OVS_VPORT_TYPE_SRV6;
>>> +}
>>> +
>>> +int
>>> +netdev_srv6_build_header(const struct netdev *netdev,
>>> +                         struct ovs_action_push_tnl *data,
>>> +                         const struct netdev_tnl_build_header_params 
>>> *params)
>>> +{
>>> +    struct netdev_vport *dev = netdev_vport_cast(netdev);
>>> +    struct netdev_tunnel_config *tnl_cfg;
>>> +
>>> +    ovs_mutex_lock(&dev->mutex);
>>> +    tnl_cfg = &dev->tnl_cfg;
>>> +
>>> +    if (tnl_cfg->srv6_num_segs) {
>>> +        srv6_build_header(data, params,
>>> +                          tnl_cfg->srv6_num_segs, tnl_cfg->srv6_segs);
>>> +    } else {
>>> +        /*
>>> +         * If explicit segment list setting is omitted, tunnel destination
>>> +         * is considered to be the first segment list.
>>> +         */
>>> +        srv6_build_header(data, params,
>>> +                          1, &params->flow->tunnel.ipv6_dst);
>>> +    }
>>
>> Isn't it a misconfig if segs[0] != params->flow->tunnel.ipv6_dst ?
>> I mean, we shouldn't be sending a packet anywhere other than the
>> first address in the segment list, right?
>>
> 
> Right.
> We need to be sure that segs[0] == params->flow->tunnel.ipv6_dst.
> 
>> I thinking that maybe we can check that tunnel.ipv6_dst equals seg[0]
>> and fail the header build if it's not the case.  Then we can simply
>> use a generic function netdev_tnl_ip_build_header() to build a base
>> IPv6 header for us.  Then only fill in SRv6 specific fields.
>>
>> What do you think?
>>
> 
> Thanks. It seems like a simpler and better approach.
> Less conditional branching and also the intended behavior.
> I imagine the implementation to be as follows:
>  
> 1. if num_segs = 0 (i.e., the user did not specify a segment list),
>    set segs[0] = tunnel.ipv6_dst and num_segs = 1.
> 
> 2. if tunnel.ipv6_dst != segs[0], we make header build fail.
> 
> 3. call netdev_tnl_ip_build_header().
> 
> 4. fill segment routing header.
> 
> How about this approach?

Sounds good.  Thanks!

Best regards, Ilya Maximets.
_______________________________________________
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Reply via email to