Re: [PATCH net] sit: Set skb->protocol properly in ipip6_tunnel_xmit()

2016-11-28 Thread David Miller
From: Eric Dumazet 
Date: Mon, 28 Nov 2016 10:17:23 -0800

> It appears that Eli changelog was not very precise, because his bug was
> because of XFRM being involved.
> 
> xfrm_output() -> xfrm_output_gso() -> skb_gso_segment()
> 
> So XFRM calls skb_gso_segment() before ip_output() or
> ip6_finish_output2() had a chance to change skb->protocol

I think if we do it in ip6_local_out and ip_local_out, then none of
these hacks will be necessary at all.


Re: [PATCH net] sit: Set skb->protocol properly in ipip6_tunnel_xmit()

2016-11-28 Thread David Miller
From: Eric Dumazet 
Date: Mon, 28 Nov 2016 10:32:15 -0800

> On Mon, 2016-11-28 at 10:17 -0800, Eric Dumazet wrote:
> 
>> I was referring to Stephen bug report.
>> 
>> It appears that Eli changelog was not very precise, because his bug was
>> because of XFRM being involved.
>> 
>> xfrm_output() -> xfrm_output_gso() -> skb_gso_segment()
>> 
>> So XFRM calls skb_gso_segment() before ip_output() or
>> ip6_finish_output2() had a chance to change skb->protocol
> 
> So maybe the real fix would be to set skb->protocol at the right place,
> before xfrm can be called, instead of chasing all skb producers ;)

And the key for this would be dst_output() invocations.


Re: [PATCH net] sit: Set skb->protocol properly in ipip6_tunnel_xmit()

2016-11-28 Thread Eric Dumazet
On Mon, 2016-11-28 at 10:17 -0800, Eric Dumazet wrote:

> I was referring to Stephen bug report.
> 
> It appears that Eli changelog was not very precise, because his bug was
> because of XFRM being involved.
> 
> xfrm_output() -> xfrm_output_gso() -> skb_gso_segment()
> 
> So XFRM calls skb_gso_segment() before ip_output() or
> ip6_finish_output2() had a chance to change skb->protocol

So maybe the real fix would be to set skb->protocol at the right place,
before xfrm can be called, instead of chasing all skb producers ;)







Re: [PATCH net] sit: Set skb->protocol properly in ipip6_tunnel_xmit()

2016-11-28 Thread Eric Dumazet
On Mon, 2016-11-28 at 13:01 -0500, David Miller wrote:

> Hmmm, ip6_finish_output2() also does a proper assignment of skb->protocol,
> therefore why was commit ae148b085876fa771d9ef2c05f85d4b4bf09ce0d
> ("ip6_tunnel: Update skb->protocol to ETH_P_IPV6 in ip6_tnl_xmit()")
> necessary at all?
I was referring to Stephen bug report.

It appears that Eli changelog was not very precise, because his bug was
because of XFRM being involved.

xfrm_output() -> xfrm_output_gso() -> skb_gso_segment()

So XFRM calls skb_gso_segment() before ip_output() or
ip6_finish_output2() had a chance to change skb->protocol





Re: [PATCH net] sit: Set skb->protocol properly in ipip6_tunnel_xmit()

2016-11-28 Thread David Miller
From: Eric Dumazet 
Date: Mon, 28 Nov 2016 09:34:27 -0800

> On Mon, 2016-11-28 at 08:53 -0800, Eric Dumazet wrote:
>> On Mon, 2016-11-28 at 11:47 -0500, David Miller wrote:
>> > From: Stephen Rothwell 
>> > Date: Sun, 27 Nov 2016 13:04:00 +1100
>> > 
>> > > [Just for Dave's information]
>> > > 
>> > > On Fri, 25 Nov 2016 13:50:17 +0800 Eli Cooper  wrote:
>> > >>
>> > >> Similar to commit ae148b085876
>> > >> ("ip6_tunnel: Update skb->protocol to ETH_P_IPV6 in ip6_tnl_xmit()"),
>> > >> sit tunnels also need to update skb->protocol; otherwise, TSO/GSO 
>> > >> packets
>> > >> might not be properly segmented, which causes the packets being dropped.
>> > >> 
>> > >> Reported-by: Stephen Rothwell 
>> > >> Tested-by: Eli Cooper 
>> > >> Cc: sta...@vger.kernel.org
>> > >> Signed-off-by: Eli Cooper 
>> > > 
>> > > I tested this patch and it does *not* solve my problem.
>> > 
>> > I'm torn on this patch, because it looked exactly like it would solve the
>> > kind of problem Stephen is running into.
>> > 
>> > Even though it doesn't fix his case, it seems correct to me.
>> > 
>> > I was wondering if it was also important to set the skb->protocol
>> > before the call to ip_tunnel_encap() but I couldn't find a dependency.
>> > 
>> > In any event I'd like to see some other people review this change
>> > before I apply it.
>> > 
>> > My only other guess for Stephen's problem is somehow the SKB headers
>> > aren't set up properly for what the GSO engine expects.
>> 
>> Well, mlx4 just works, and uses GSO engine just fine.
>> 
>> So my guess is this is a bug in Intel IGB driver.
> 
> About Eli patch : I do not believe it is needed.
> 
> Here is the path followed by SIT packet being GSO at the device layer :
> 
> We can see that ip_output() was called, and ip_output() already does :
> 
> skb->protocol = htons(ETH_P_IP);

Hmmm, ip6_finish_output2() also does a proper assignment of skb->protocol,
therefore why was commit ae148b085876fa771d9ef2c05f85d4b4bf09ce0d
("ip6_tunnel: Update skb->protocol to ETH_P_IPV6 in ip6_tnl_xmit()")
necessary at all?


Re: [PATCH net] sit: Set skb->protocol properly in ipip6_tunnel_xmit()

2016-11-28 Thread Alexander Duyck
On Mon, Nov 28, 2016 at 8:53 AM, Eric Dumazet  wrote:
> On Mon, 2016-11-28 at 11:47 -0500, David Miller wrote:
>> From: Stephen Rothwell 
>> Date: Sun, 27 Nov 2016 13:04:00 +1100
>>
>> > [Just for Dave's information]
>> >
>> > On Fri, 25 Nov 2016 13:50:17 +0800 Eli Cooper  wrote:
>> >>
>> >> Similar to commit ae148b085876
>> >> ("ip6_tunnel: Update skb->protocol to ETH_P_IPV6 in ip6_tnl_xmit()"),
>> >> sit tunnels also need to update skb->protocol; otherwise, TSO/GSO packets
>> >> might not be properly segmented, which causes the packets being dropped.
>> >>
>> >> Reported-by: Stephen Rothwell 
>> >> Tested-by: Eli Cooper 
>> >> Cc: sta...@vger.kernel.org
>> >> Signed-off-by: Eli Cooper 
>> >
>> > I tested this patch and it does *not* solve my problem.
>>
>> I'm torn on this patch, because it looked exactly like it would solve the
>> kind of problem Stephen is running into.
>>
>> Even though it doesn't fix his case, it seems correct to me.
>>
>> I was wondering if it was also important to set the skb->protocol
>> before the call to ip_tunnel_encap() but I couldn't find a dependency.
>>
>> In any event I'd like to see some other people review this change
>> before I apply it.
>>
>> My only other guess for Stephen's problem is somehow the SKB headers
>> aren't set up properly for what the GSO engine expects.
>
> Well, mlx4 just works, and uses GSO engine just fine.
>
> So my guess is this is a bug in Intel IGB driver.
>
>
> Alexander, can you take a look ?
>
> Features for eth0:
> rx-checksumming: on
> tx-checksumming: on
> tx-checksum-ipv4: off [fixed]
> tx-checksum-ip-generic: on
> tx-checksum-ipv6: off [fixed]
> tx-checksum-fcoe-crc: off [fixed]
> tx-checksum-sctp: on
> scatter-gather: on
> tx-scatter-gather: on
> tx-scatter-gather-fraglist: off [fixed]
> tcp-segmentation-offload: on
> tx-tcp-segmentation: on
> tx-tcp-ecn-segmentation: off [fixed]
> tx-tcp-mangleid-segmentation: off
> tx-tcp6-segmentation: on
> udp-fragmentation-offload: off [fixed]
> generic-segmentation-offload: on
> generic-receive-offload: on
> large-receive-offload: off [fixed]
> rx-vlan-offload: on
> tx-vlan-offload: on
> ntuple-filters: off
> receive-hashing: on
> highdma: on [fixed]
> rx-vlan-filter: on [fixed]
> vlan-challenged: off [fixed]
> tx-lockless: off [fixed]
> netns-local: off [fixed]
> tx-gso-robust: off [fixed]
> tx-fcoe-segmentation: off [fixed]
> tx-gre-segmentation: on
> tx-gre-csum-segmentation: on
> tx-ipxip4-segmentation: on
> tx-ipxip6-segmentation: on
> tx-udp_tnl-segmentation: on
> tx-udp_tnl-csum-segmentation: on
> tx-gso-partial: on
> fcoe-mtu: off [fixed]
> tx-nocache-copy: off
> loopback: off [fixed]
> rx-fcs: off [fixed]
> rx-all: off
> tx-vlan-stag-hw-insert: off [fixed]
> rx-vlan-stag-hw-parse: off [fixed]
> rx-vlan-stag-filter: off [fixed]
> l2-fwd-offload: off [fixed]
> busy-poll: off [fixed]
> hw-tc-offload: off [fixed]

I agree.  This sounds like a bug with the GSO partial on igb.

I'll try setting up a SIT tunnel on my systems here and see if I can
reproduce the issue.

- Alex


Re: [PATCH net] sit: Set skb->protocol properly in ipip6_tunnel_xmit()

2016-11-28 Thread Eric Dumazet
On Mon, 2016-11-28 at 08:53 -0800, Eric Dumazet wrote:
> On Mon, 2016-11-28 at 11:47 -0500, David Miller wrote:
> > From: Stephen Rothwell 
> > Date: Sun, 27 Nov 2016 13:04:00 +1100
> > 
> > > [Just for Dave's information]
> > > 
> > > On Fri, 25 Nov 2016 13:50:17 +0800 Eli Cooper  wrote:
> > >>
> > >> Similar to commit ae148b085876
> > >> ("ip6_tunnel: Update skb->protocol to ETH_P_IPV6 in ip6_tnl_xmit()"),
> > >> sit tunnels also need to update skb->protocol; otherwise, TSO/GSO packets
> > >> might not be properly segmented, which causes the packets being dropped.
> > >> 
> > >> Reported-by: Stephen Rothwell 
> > >> Tested-by: Eli Cooper 
> > >> Cc: sta...@vger.kernel.org
> > >> Signed-off-by: Eli Cooper 
> > > 
> > > I tested this patch and it does *not* solve my problem.
> > 
> > I'm torn on this patch, because it looked exactly like it would solve the
> > kind of problem Stephen is running into.
> > 
> > Even though it doesn't fix his case, it seems correct to me.
> > 
> > I was wondering if it was also important to set the skb->protocol
> > before the call to ip_tunnel_encap() but I couldn't find a dependency.
> > 
> > In any event I'd like to see some other people review this change
> > before I apply it.
> > 
> > My only other guess for Stephen's problem is somehow the SKB headers
> > aren't set up properly for what the GSO engine expects.
> 
> Well, mlx4 just works, and uses GSO engine just fine.
> 
> So my guess is this is a bug in Intel IGB driver.

About Eli patch : I do not believe it is needed.

Here is the path followed by SIT packet being GSO at the device layer :

We can see that ip_output() was called, and ip_output() already does :

skb->protocol = htons(ETH_P_IP);

[   71.209437] sit: skb->protocol = dd86
[   71.215387] skb_mac_gso_segment type 8
[   71.219176] [ cut here ]
[   71.219180] WARNING: CPU: 25 PID: 12087 at net/core/dev.c:2642 
skb_mac_gso_segment+0x166/0x170
[   71.219200] CPU: 25 PID: 12087 Comm: netperf Not tainted 4.9.0-smp-DEV #362
[   71.219203]  c90019d4b4d8 813ca25b 0009 

[   71.219204]  c90019d4b518 810d0c13 0a5219d4b4f0 
0008
[   71.219205]  881ff2524ae8 0001 0001 
881fe1107a9c
[   71.219205] Call Trace:
[   71.219209]  [] dump_stack+0x4d/0x72
[   71.219213]  [] __warn+0xd3/0xf0
[   71.219214]  [] warn_slowpath_null+0x1e/0x20
[   71.219215]  [] skb_mac_gso_segment+0x166/0x170
[   71.219216]  [] __skb_gso_segment+0xb9/0x140
[   71.219218]  [] validate_xmit_skb+0x136/0x2d0
[   71.219219]  [] validate_xmit_skb_list+0x43/0x70
[   71.219221]  [] sch_direct_xmit+0x147/0x1a0
[   71.219223]  [] __dev_queue_xmit+0x421/0x620
[   71.219224]  [] dev_queue_xmit+0x10/0x20
[   71.219226]  [] ip_finish_output2+0x254/0x330
[   71.219227]  [] ip_finish_output+0x136/0x1d0
[   71.219228]  [] ip_output+0xab/0xc0
[   71.219230]  [] ? ip_fragment.constprop.58+0x80/0x80
[   71.219231]  [] ip_local_out+0x35/0x40
[   71.219233]  [] iptunnel_xmit+0x14c/0x1c0
[   71.219235]  [] sit_tunnel_xmit+0x50a/0x860
[   71.219236]  [] dev_hard_start_xmit+0xb0/0x200
[   71.219238]  [] __dev_queue_xmit+0x582/0x620
[   71.219239]  [] dev_queue_xmit+0x10/0x20
[   71.219241]  [] neigh_direct_output+0x11/0x20
[   71.219243]  [] ip6_finish_output2+0x181/0x460
[   71.219245]  [] ? ip6table_mangle_hook+0x3b/0x130
[   71.219246]  [] ? ip6_finish_output2+0x194/0x460
[   71.219247]  [] ? ip6table_mangle_hook+0x3b/0x130
[   71.219249]  [] ip6_finish_output+0xa6/0x110
[   71.219250]  [] ip6_output+0x95/0xe0
[   71.219251]  [] ? ip6_fragment+0x9e0/0x9e0
[   71.219252]  [] dst_output+0xf/0x20
[   71.219254]  [] NF_HOOK.constprop.45+0x77/0x90
[   71.219255]  [] ? ac6_proc_exit+0x20/0x20
[   71.219256]  [] ip6_xmit+0x268/0x4d0
[   71.219257]  [] ? ac6_proc_exit+0x20/0x20
[   71.219258]  [] ? ip6_xmit+0x268/0x4d0
[   71.219260]  [] ? inet6_csk_route_socket+0x120/0x1d0
[   71.219262]  [] ? __kmalloc_node_track_caller+0x31/0x40
[   71.219264]  [] inet6_csk_xmit+0x7e/0xc0
[   71.219265]  [] ? __kmalloc_node_track_caller+0x31/0x40
[   71.219267]  [] tcp_transmit_skb+0x547/0x910
[   71.219268]  [] tcp_write_xmit+0x3bd/0xf70
[   71.219270]  [] ? __radix_tree_create+0x2d0/0x360
[   71.219271]  [] tcp_push_one+0x34/0x40
[   71.219272]  [] tcp_sendmsg+0x511/0xbc0
[   71.219275]  [] inet_sendmsg+0x65/0xa0
[   71.219277]  [] sock_sendmsg+0x38/0x50
[   71.219278]  [] SYSC_sendto+0x106/0x170
[   71.219281]  [] ? do_setitimer+0x1f6/0x250
[   71.219282]  [] ? alarm_setitimer+0x41/0x70
[   71.219283]  [] SyS_sendto+0xe/0x10
[   71.219286]  [] entry_SYSCALL_64_fastpath+0x13/0x94




Re: [PATCH net] sit: Set skb->protocol properly in ipip6_tunnel_xmit()

2016-11-28 Thread Eric Dumazet
On Mon, 2016-11-28 at 11:47 -0500, David Miller wrote:
> From: Stephen Rothwell 
> Date: Sun, 27 Nov 2016 13:04:00 +1100
> 
> > [Just for Dave's information]
> > 
> > On Fri, 25 Nov 2016 13:50:17 +0800 Eli Cooper  wrote:
> >>
> >> Similar to commit ae148b085876
> >> ("ip6_tunnel: Update skb->protocol to ETH_P_IPV6 in ip6_tnl_xmit()"),
> >> sit tunnels also need to update skb->protocol; otherwise, TSO/GSO packets
> >> might not be properly segmented, which causes the packets being dropped.
> >> 
> >> Reported-by: Stephen Rothwell 
> >> Tested-by: Eli Cooper 
> >> Cc: sta...@vger.kernel.org
> >> Signed-off-by: Eli Cooper 
> > 
> > I tested this patch and it does *not* solve my problem.
> 
> I'm torn on this patch, because it looked exactly like it would solve the
> kind of problem Stephen is running into.
> 
> Even though it doesn't fix his case, it seems correct to me.
> 
> I was wondering if it was also important to set the skb->protocol
> before the call to ip_tunnel_encap() but I couldn't find a dependency.
> 
> In any event I'd like to see some other people review this change
> before I apply it.
> 
> My only other guess for Stephen's problem is somehow the SKB headers
> aren't set up properly for what the GSO engine expects.

Well, mlx4 just works, and uses GSO engine just fine.

So my guess is this is a bug in Intel IGB driver.


Alexander, can you take a look ?

Features for eth0:
rx-checksumming: on
tx-checksumming: on
tx-checksum-ipv4: off [fixed]
tx-checksum-ip-generic: on
tx-checksum-ipv6: off [fixed]
tx-checksum-fcoe-crc: off [fixed]
tx-checksum-sctp: on
scatter-gather: on
tx-scatter-gather: on
tx-scatter-gather-fraglist: off [fixed]
tcp-segmentation-offload: on
tx-tcp-segmentation: on
tx-tcp-ecn-segmentation: off [fixed]
tx-tcp-mangleid-segmentation: off
tx-tcp6-segmentation: on
udp-fragmentation-offload: off [fixed]
generic-segmentation-offload: on
generic-receive-offload: on
large-receive-offload: off [fixed]
rx-vlan-offload: on
tx-vlan-offload: on
ntuple-filters: off
receive-hashing: on
highdma: on [fixed]
rx-vlan-filter: on [fixed]
vlan-challenged: off [fixed]
tx-lockless: off [fixed]
netns-local: off [fixed]
tx-gso-robust: off [fixed]
tx-fcoe-segmentation: off [fixed]
tx-gre-segmentation: on
tx-gre-csum-segmentation: on
tx-ipxip4-segmentation: on
tx-ipxip6-segmentation: on
tx-udp_tnl-segmentation: on
tx-udp_tnl-csum-segmentation: on
tx-gso-partial: on
fcoe-mtu: off [fixed]
tx-nocache-copy: off
loopback: off [fixed]
rx-fcs: off [fixed]
rx-all: off
tx-vlan-stag-hw-insert: off [fixed]
rx-vlan-stag-hw-parse: off [fixed]
rx-vlan-stag-filter: off [fixed]
l2-fwd-offload: off [fixed]
busy-poll: off [fixed]
hw-tc-offload: off [fixed]






Re: [PATCH net] sit: Set skb->protocol properly in ipip6_tunnel_xmit()

2016-11-28 Thread David Miller
From: Stephen Rothwell 
Date: Sun, 27 Nov 2016 13:04:00 +1100

> [Just for Dave's information]
> 
> On Fri, 25 Nov 2016 13:50:17 +0800 Eli Cooper  wrote:
>>
>> Similar to commit ae148b085876
>> ("ip6_tunnel: Update skb->protocol to ETH_P_IPV6 in ip6_tnl_xmit()"),
>> sit tunnels also need to update skb->protocol; otherwise, TSO/GSO packets
>> might not be properly segmented, which causes the packets being dropped.
>> 
>> Reported-by: Stephen Rothwell 
>> Tested-by: Eli Cooper 
>> Cc: sta...@vger.kernel.org
>> Signed-off-by: Eli Cooper 
> 
> I tested this patch and it does *not* solve my problem.

I'm torn on this patch, because it looked exactly like it would solve the
kind of problem Stephen is running into.

Even though it doesn't fix his case, it seems correct to me.

I was wondering if it was also important to set the skb->protocol
before the call to ip_tunnel_encap() but I couldn't find a dependency.

In any event I'd like to see some other people review this change
before I apply it.

My only other guess for Stephen's problem is somehow the SKB headers
aren't set up properly for what the GSO engine expects.


Re: [PATCH net] sit: Set skb->protocol properly in ipip6_tunnel_xmit()

2016-11-26 Thread Stephen Rothwell
Hi Eli,

[Just for Dave's information]

On Fri, 25 Nov 2016 13:50:17 +0800 Eli Cooper  wrote:
>
> Similar to commit ae148b085876
> ("ip6_tunnel: Update skb->protocol to ETH_P_IPV6 in ip6_tnl_xmit()"),
> sit tunnels also need to update skb->protocol; otherwise, TSO/GSO packets
> might not be properly segmented, which causes the packets being dropped.
> 
> Reported-by: Stephen Rothwell 
> Tested-by: Eli Cooper 
> Cc: sta...@vger.kernel.org
> Signed-off-by: Eli Cooper 

I tested this patch and it does *not* solve my problem.

-- 
Cheers,
Stephen Rothwell


[PATCH net] sit: Set skb->protocol properly in ipip6_tunnel_xmit()

2016-11-24 Thread Eli Cooper
Similar to commit ae148b085876
("ip6_tunnel: Update skb->protocol to ETH_P_IPV6 in ip6_tnl_xmit()"),
sit tunnels also need to update skb->protocol; otherwise, TSO/GSO packets
might not be properly segmented, which causes the packets being dropped.

Reported-by: Stephen Rothwell 
Tested-by: Eli Cooper 
Cc: sta...@vger.kernel.org
Signed-off-by: Eli Cooper 
---
 net/ipv6/sit.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/net/ipv6/sit.c b/net/ipv6/sit.c
index b1cdf80..a05dceb 100644
--- a/net/ipv6/sit.c
+++ b/net/ipv6/sit.c
@@ -972,6 +972,7 @@ static netdev_tx_t ipip6_tunnel_xmit(struct sk_buff *skb,
}
 
skb_set_inner_ipproto(skb, IPPROTO_IPV6);
+   skb->protocol = htons(ETH_P_IP);
 
iptunnel_xmit(NULL, rt, skb, fl4.saddr, fl4.daddr, protocol, tos, ttl,
  df, !net_eq(tunnel->net, dev_net(dev)));
-- 
2.10.2