Re: [ovs-discuss] gso packet is failing with af_packet socket with packet_vnet_hdr

2019-11-07 Thread Flavio Leitner
On Wed, 6 Nov 2019 01:59:41 +0530
Ramana Reddy  wrote:

> Hi Flavio,
> As per your inputs, I modified the gso_size, and now
> skb_gso_validate_mtu(skb, mtu) returns true, and
> ip_finish_output2(sk, skb) and dst_neigh_output(dst, neigh, skb) are
> getting called. But I still see the large packets getting dropped
> somewhere further down in the kernel, and retransmissions happening.

The gso_size is the size of the data payload, so it doesn't account for the
headers. Usually this depends on the iface MTU, as I pointed out before, and
that MTU should account for the encapsulation added later on. For example:

veth0(1450) <-> veth1(1450) -- VXLAN(64k) --- eth0(1500)
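
To make the arithmetic concrete, here is a minimal sketch in C. The helper names (inner_mtu, gso_size_for_mtu) are made up for illustration, and the 50-byte VXLAN overhead and the 32-byte TCP header (20 bytes plus 12 bytes of options) are taken from the numbers quoted in this thread, not measured:

```c
#include <assert.h>

/* Hypothetical helper: MTU usable by the inner packet when it enters on an
 * interface with ingress_mtu and later leaves through a tunnel whose
 * underlay has egress_mtu and which adds encap_overhead bytes. */
static unsigned int inner_mtu(unsigned int ingress_mtu,
                              unsigned int egress_mtu,
                              unsigned int encap_overhead)
{
    unsigned int room;

    assert(egress_mtu > encap_overhead);
    room = egress_mtu - encap_overhead;
    return ingress_mtu < room ? ingress_mtu : room;
}

/* Hypothetical helper: payload bytes per segment (gso_size) that fit in
 * an L3 MTU once the IP and TCP headers are subtracted. The Ethernet
 * header is not part of the MTU, so it does not appear here. */
static unsigned int gso_size_for_mtu(unsigned int mtu,
                                     unsigned int ip_hlen,
                                     unsigned int tcp_hlen)
{
    assert(mtu > ip_hlen + tcp_hlen);
    return mtu - ip_hlen - tcp_hlen;
}
```

With the values from the diagram, inner_mtu(1450, 1500, 50) gives 1450 and gso_size_for_mtu(1450, 20, 32) gives 1398, the same gso_size reported elsewhere in this thread for the working iperf traffic.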

Perhaps you can use "dropwatch" to find out where the packet is
dropped.

fbl


Re: [ovs-discuss] gso packet is failing with af_packet socket with packet_vnet_hdr

2019-11-05 Thread Ramana Reddy
Hi Flavio,
As per your inputs, I modified the gso_size, and now
skb_gso_validate_mtu(skb, mtu) returns true, and
ip_finish_output2(sk, skb) and dst_neigh_output(dst, neigh, skb) are
getting called. But I still see the large packets getting dropped
somewhere further down in the kernel, and retransmissions happening.

if (skb_gso_validate_mtu(skb, mtu))
 return ip_finish_output2(sk, skb);

[ 1854.905733] vxlan_xmit:2262 skb->len:2776 packet_length:2762
[ 1854.905744] skb_gso_size_check:4478 and seg_len:1500 and max_len:1500 and shinfo->gso_size:1398 and GSO_BY_FRAGS:65535

The gso_size of 1398 bytes is correct in my case (1398 + 50 (VXLAN header) +
20 (IP) + 32 (TCP) + 14 (ETH) = 1514 bytes).
The code is simple:
    vnet = buf;  // buf is an array of 64k bytes
    len = 0;
    if (csum) {
        vnet->flags = VIRTIO_NET_HDR_F_NEEDS_CSUM;
        vnet->csum_start = ETH_HLEN + sizeof(*iph);
        vnet->csum_offset = __builtin_offsetof(struct tcphdr, check);
    }

    if (gso) {
        vnet->hdr_len = ETH_HLEN + sizeof(*iph) + sizeof(*tcph);
        vnet->gso_type = VIRTIO_NET_HDR_GSO_TCPV4;
        // 50 is the vxlan header
        vnet->gso_size = ETH_DATA_LEN - 50 - sizeof(struct iphdr) -
                         sizeof(struct tcphdr);
    } else {
        vnet->gso_type = VIRTIO_NET_HDR_GSO_NONE;
        vnet->gso_size = 0;
    }
    len = sizeof(*vnet);

    // Now copying the entire L2 packet into the buf starting at
    // an offset buf + len and sending the packet.
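
One thing that may be worth checking (an assumption, since the full program is not shown): sizeof(*tcph) is 20 for the standard struct tcphdr, while the arithmetic above counts a 32-byte TCP header. If the segments carry TCP options, the header lengths could be read from the packet itself. A minimal sketch, where fill_vnet_hdr and its mtu parameter are hypothetical names and frame is assumed to point at an Ethernet + IPv4 + TCP packet:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <linux/if_ether.h>    /* ETH_HLEN */
#include <linux/ip.h>          /* struct iphdr */
#include <linux/tcp.h>         /* struct tcphdr */
#include <linux/virtio_net.h>  /* struct virtio_net_hdr */

/* Hypothetical helper: fill a virtio_net_hdr for a TCP/IPv4 frame.
 * Header lengths come from the ihl and doff fields of the packet, so
 * IP options and TCP options (e.g. timestamps) are counted.
 * mtu is assumed to be the L3 MTU the segments must fit in. */
static void fill_vnet_hdr(struct virtio_net_hdr *vnet,
                          const uint8_t *frame, unsigned int mtu)
{
    struct iphdr iph;
    struct tcphdr tcph;
    unsigned int ip_hlen, tcp_hlen;

    /* Copy the headers out to avoid unaligned accesses into frame. */
    memcpy(&iph, frame + ETH_HLEN, sizeof(iph));
    ip_hlen = iph.ihl * 4;
    memcpy(&tcph, frame + ETH_HLEN + ip_hlen, sizeof(tcph));
    tcp_hlen = tcph.doff * 4;
    assert(mtu > ip_hlen + tcp_hlen);

    memset(vnet, 0, sizeof(*vnet));
    vnet->flags = VIRTIO_NET_HDR_F_NEEDS_CSUM;
    vnet->csum_start = ETH_HLEN + ip_hlen;
    vnet->csum_offset = offsetof(struct tcphdr, check);
    vnet->hdr_len = ETH_HLEN + ip_hlen + tcp_hlen;
    vnet->gso_type = VIRTIO_NET_HDR_GSO_TCPV4;
    vnet->gso_size = mtu - ip_hlen - tcp_hlen;
}
```

For an MTU of 1450 and a 32-byte TCP header this yields gso_size 1398 and hdr_len 66, whereas ETH_DATA_LEN - 50 - sizeof(struct iphdr) - sizeof(struct tcphdr) evaluates to 1430 with the standard 20-byte structs.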

Did I miss something? I am not sure how OVS behaves after receiving
this packet and before transmitting it to vxlan.
How does checksum offloading happen with af_packet in OVS? Does OVS have
any role in this?


Please see the attached image for reference. The packet flow within the
host is given below:

Ubuntu container (eth0 (1500 MTU)) --routing lookup--> Ubuntu container (veth0 (1450 MTU))
-> OVS (veth1 (1450 MTU)) -> vxlan (65K MTU) -> eth0 (physical interface (1500 MTU))
-> other machine.

Looking forward to your reply.
Regards,
Ramana



Re: [ovs-discuss] gso packet is failing with af_packet socket with packet_vnet_hdr

2019-11-04 Thread Ramana Reddy
Thanks, Flavio. I will check it out tomorrow and let you know how it goes.

Regards,
Ramana



Re: [ovs-discuss] gso packet is failing with af_packet socket with packet_vnet_hdr

2019-11-04 Thread Flavio Leitner
On Mon, 4 Nov 2019 21:32:28 +0530
Ramana Reddy  wrote:

> Hi Flavio Leitner,
> Thank you very much for your reply. Here is the code snippet. The
> same code works if I send the packet without OVS.

Could you provide more details on the OvS environment and the test?

The Linux kernel propagates the header size dependencies through
net_device->hard_header_len when you stack devices, so in the case of a
vxlan dev it will be:

needed_headroom = lowerdev->hard_header_len;
needed_headroom += VXLAN_HEADROOM;
dev->needed_headroom = needed_headroom;

Sounds like that is helping when OvS is not being used.
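
As a rough userspace illustration of that propagation (a simplified model, not the kernel source): struct dev_model and vxlan_stack_on are invented names, VXLAN_HEADROOM is spelled out as the 50 bytes of IPv4 VXLAN overhead, and the MTU line rests on the assumption that a vxlan device created on top of a lower device also derives its default MTU from it:

```c
/* Outer IPv4 (20) + UDP (8) + VXLAN (8) + inner Ethernet (14) = 50 bytes. */
#define VXLAN_HEADROOM (20 + 8 + 8 + 14)

/* Hypothetical stand-in for the few net_device fields involved. */
struct dev_model {
    unsigned int mtu;
    unsigned short hard_header_len;
    unsigned short needed_headroom;
};

/* Model of a vxlan device being stacked on a lower device. */
static void vxlan_stack_on(struct dev_model *vxlan,
                           const struct dev_model *lower)
{
    unsigned short needed_headroom = lower->hard_header_len;

    needed_headroom += VXLAN_HEADROOM;
    vxlan->needed_headroom = needed_headroom;

    /* Assumption: the default MTU leaves room for the encapsulation. */
    vxlan->mtu = lower->mtu - VXLAN_HEADROOM;
}
```

Under this model a vxlan device stacked on a 1500-byte eth0 ends up with 64 bytes of needed headroom and a 1450-byte MTU, while the vxlan port in the OVS setup is reported elsewhere in this thread with a 64k/65K MTU, which may be part of why the two cases behave differently.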

fbl



Re: [ovs-discuss] gso packet is failing with af_packet socket with packet_vnet_hdr

2019-11-04 Thread Ramana Reddy
Hi Flavio Leitner,
Thank you very much for your reply. Here is the code snippet. The same
code works if I send the packet without OVS.

bool csum = true;
bool gso = true;
struct virtio_net_hdr *vnet = buf;

if (csum) {
    vnet->flags = VIRTIO_NET_HDR_F_NEEDS_CSUM;
    vnet->csum_start = ETH_HLEN + sizeof(*iph);
    vnet->csum_offset = __builtin_offsetof(struct tcphdr, check);
}

if (gso) {
    vnet->hdr_len = ETH_HLEN + sizeof(*iph) + sizeof(*tcph);
    vnet->gso_type = VIRTIO_NET_HDR_GSO_TCPV4;
    vnet->gso_size = ETH_DATA_LEN - sizeof(struct iphdr) -
                     sizeof(struct tcphdr);
} else {
    vnet->gso_type = VIRTIO_NET_HDR_GSO_NONE;
}

Regards,
Ramana



Re: [ovs-discuss] gso packet is failing with af_packet socket with packet_vnet_hdr

2019-11-04 Thread Flavio Leitner


Hi,

What's the value you're passing in gso_size in struct virtio_net_hdr?
You need to leave room for the encapsulation header, e.g.:

gso_size = iface_mtu - virtio_net_hdr->hdr_len

fbl


[ovs-discuss] gso packet is failing with af_packet socket with packet_vnet_hdr

2019-11-03 Thread Ramana Reddy
Hi,
I am wondering if anyone can help me with this. I am having trouble sending
TSO/GSO packets through an af_packet socket with packet_vnet_hdr (through
virtio_net_hdr) over a vxlan tunnel in OVS.

What I observed is that the following function (net/core/skbuff.c) is
eventually hit and returns false, hence the packet is dropped.

static inline bool skb_gso_size_check(const struct sk_buff *skb,
                                      unsigned int seg_len,
                                      unsigned int max_len)
{
        const struct skb_shared_info *shinfo = skb_shinfo(skb);
        const struct sk_buff *iter;

        if (shinfo->gso_size != GSO_BY_FRAGS)
                return seg_len <= max_len;
        ..
}

[  678.756673] ip_finish_output_gso:235 packet_length:2762 (here
packet_length = skb->len - skb_inner_network_offset(skb))
[  678.756678] ip_fragment:510 packet length:1500
[  678.756715] ip_fragment:510 packet length:1314
[  678.956889] skb_gso_size_check:4474 and seg_len:1550 and max_len:1500 and shinfo->gso_size:1448 and GSO_BY_FRAGS:65535

Observation:
When we send the large packet (for example, packet_length:2762 here), it
shows seg_len (1550) > max_len (1500), so the "return seg_len <= max_len"
statement returns false.
Because of this, ip_fragment calls icmp_send(skb, ICMP_DEST_UNREACH,
ICMP_FRAG_NEEDED, htonl(mtu)); instead of the code reaching
ip_finish_output2(sk, skb). The function, in net/ipv4/ip_output.c, is
given below:

static int ip_finish_output_gso(struct sock *sk, struct sk_buff *skb,
                                unsigned int mtu)
{
        netdev_features_t features;
        struct sk_buff *segs;
        int ret = 0;

        /* common case: seglen is <= mtu */
        if (skb_gso_validate_mtu(skb, mtu))
                return ip_finish_output2(sk, skb);
        ...
        err = ip_fragment(sk, segs, mtu, ip_finish_output2);
        ...
}

But when we send normal iperf traffic (gso/tso traffic) over vxlan,
skb_gso_size_check returns true and ip_finish_output2 gets executed.
Here are the values for normal iperf traffic over vxlan.

[ 1041.400537] skb_gso_size_check:4477 and seg_len:1500 and max_len:1500 and shinfo->gso_size:1398 and GSO_BY_FRAGS:65535
[ 1041.400587] skb_gso_size_check:4477 and seg_len:1450 and max_len:1450 and shinfo->gso_size:1398 and GSO_BY_FRAGS:65535
[ 1041.400594] skb_gso_size_check:4477 and seg_len:1500 and max_len:1500 and shinfo->gso_size:1398 and GSO_BY_FRAGS:65535
[ 1041.400732] skb_gso_size_check:4477 and seg_len:1450 and max_len:1450 and shinfo->gso_size:1398 and GSO_BY_FRAGS:65535
[ 1041.400741] skb_gso_size_check:4477 and seg_len:1450 and max_len:1450 and shinfo->gso_size:1398 and GSO_BY_FRAGS:65535
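
The two sets of numbers can be reproduced with a small model of the check. This is only a sketch: gso_seg_len and gso_fits are made-up names, not kernel functions, and the header breakdown (20 outer IP + 8 UDP + 8 VXLAN + 14 inner Ethernet + 20 inner IP + 32 TCP bytes) is inferred from the logged seg_len and gso_size values rather than read from the packets:

```c
#include <stdbool.h>

enum {
    OUTER_HDRS = 20 + 8 + 8 + 14, /* outer IPv4 + UDP + VXLAN + inner Ethernet */
    INNER_HDRS = 20 + 32          /* inner IPv4 + TCP with 12 bytes of options */
};

/* Length of one segment: all headers counted by the check plus gso_size. */
static unsigned int gso_seg_len(unsigned int hdrs, unsigned int gso_size)
{
    return hdrs + gso_size;
}

/* Same comparison as "return seg_len <= max_len" in the quoted function. */
static bool gso_fits(unsigned int hdrs, unsigned int gso_size,
                     unsigned int max_len)
{
    return gso_seg_len(hdrs, gso_size) <= max_len;
}
```

With gso_size 1448 the encapsulated segment comes to 1550 bytes and fails against max_len 1500; with gso_size 1398 it is exactly 1500 and passes, and the inner packet alone is 1450, matching the max_len:1450 lines.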

Can someone help me figure out what is missing, and where I should modify
the code, in OVS or outside of OVS, so that it works as expected?

Thanks in advance.

Some more info:
[root@xx ~]# uname -r
3.10.0-1062.4.1.el7.x86_64
[root@xx ~]# cat /etc/redhat-release
Red Hat Enterprise Linux Server release 7.7 (Maipo)

[root@xx]# ovs-vsctl --version
ovs-vsctl (Open vSwitch) 2.9.0
DB Schema 7.15.1

And dump_stack output with af_packet:
[ 4833.637460]  [] dump_stack+0x19/0x1b
[ 4833.637474]  [] ip_fragment.constprop.55+0xc3/0x141
[ 4833.637481]  [] ip_finish_output+0x314/0x350
[ 4833.637484]  [] ip_output+0xb3/0x130
[ 4833.637490]  [] ? ip_do_fragment+0x910/0x910
[ 4833.637493]  [] ip_local_out_sk+0xf9/0x180
[ 4833.637497]  [] iptunnel_xmit+0x18c/0x220
[ 4833.637505]  [] udp_tunnel_xmit_skb+0x117/0x130 [udp_tunnel]
[ 4833.637538]  [] vxlan_xmit_one+0xb6a/0xb70 [vxlan]
[ 4833.637545]  [] ? vprintk_default+0x29/0x40
[ 4833.637551]  [] vxlan_xmit+0xc9e/0xef0 [vxlan]
[ 4833.637555]  [] ? kfree_skbmem+0x37/0x90
[ 4833.637559]  [] ? consume_skb+0x34/0x90
[ 4833.637564]  [] ? packet_rcv+0x4c/0x3e0
[ 4833.637570]  [] dev_hard_start_xmit+0x246/0x3b0
[ 4833.637574]  [] __dev_queue_xmit+0x519/0x650
[ 4833.637580]  [] ? try_to_wake_up+0x190/0x390
[ 4833.637585]  [] dev_queue_xmit+0x10/0x20
[ 4833.637592]  [] ovs_vport_send+0xa6/0x180 [openvswitch]
[ 4833.637599]  [] do_output+0x4e/0xd0 [openvswitch]
[ 4833.637604]  [] do_execute_actions+0xa29/0xa40 [openvswitch]
[ 4833.637610]  [] ? __wake_up_common+0x82/0x120
[ 4833.637615]  [] ovs_execute_actions+0x4c/0x140 [openvswitch]
[ 4833.637621]  [] ovs_dp_process_packet+0x84/0x120 [openvswitch]
[ 4833.637627]  [] ? ovs_ct_update_key+0xc4/0x150 [openvswitch]
[ 4833.637633]  [] ovs_vport_receive+0x73/0xd0 [openvswitch]
[ 4833.637638]  [] ? ttwu_do_activate+0x6f/0x80
[ 4833.637642]  [] ? try_to_wake_up+0x190/0x390
[ 4833.637646]  [] ? default_wake_function+0x12/0x20
[ 4833.637651]  [] ? autoremove_wake_function+0x2b/0x40
[ 4833.637657]  [] ? __wake_up_common+0x82/0x120
[ 4833.637661]  [] ? update_cfs_shares+0xa9/0xf0
[ 4833.637665]  [] ? update_curr+0x86/0x1e0
[ 4833.637669]  [] ? __enqueue_entity+0x78/0x80
[ 4833.637677]  [] netdev_frame_hook+0xde/0x180 [openvswitch]
[ 4833.637682]  [] __netif_