Re: [PATCH for v4.9 LTS 86/87] net: account for current skb length when deciding about UFO

2017-07-24 Thread Levin, Alexander (Sasha Levin)
On Sat, Jul 15, 2017 at 10:53:27AM +0200, Michal Kubecek wrote:
>On Sat, Jul 15, 2017 at 01:26:27AM +, Levin, Alexander (Sasha Levin) wrote:
>> From: Michal Kubeček 
>>
>> [ Upstream commit a5cb659bbc1c8644efa0c3138a757a1e432a4880 ]
>>
>> Our customer encountered stuck NFS writes for blocks starting at specific
>> offsets w.r.t. page boundary caused by networking stack sending packets via
>> UFO enabled device with wrong checksum. The problem can be reproduced by
>> composing a long UDP datagram from multiple parts using MSG_MORE flag:
>>
>>   sendto(sd, buff, 1000, MSG_MORE, ...);
>>   sendto(sd, buff, 1000, MSG_MORE, ...);
>>   sendto(sd, buff, 3000, 0, ...);
>>
>> Assume this packet is to be routed via a device with MTU 1500 and
>> NETIF_F_UFO enabled. When second sendto() gets into __ip_append_data(),
>> this condition is tested (among others) to decide whether to call
>> ip_ufo_append_data():
>>
>>   ((length + fragheaderlen) > mtu) || (skb && skb_is_gso(skb))
>>
>> At the moment, we already have skb with 1028 bytes of data which is not
>> marked for GSO so that the test is false (fragheaderlen is usually 20).
>> Thus we append second 1000 bytes to this skb without invoking UFO. Third
>> sendto(), however, has sufficient length to trigger the UFO path so that we
>> end up with non-UFO skb followed by a UFO one. Later on, udp_send_skb()
>> uses udp_csum() to calculate the checksum but that assumes all fragments
>> have correct checksum in skb->csum which is not true for UFO fragments.
>>
>> When checking against MTU, we need to add skb->len to length of new segment
>> if we already have a partially filled skb and fragheaderlen only if there
>> isn't one.
>>
>> In the IPv6 case, skb can only be null if this is the first segment so that
>> we have to use headersize (length of the first IPv6 header) rather than
>> fragheaderlen (length of IPv6 header of further fragments) for skb == NULL.
>>
>> Fixes: e89e9cf539a2 ("[IPv4/IPv6]: UFO Scatter-gather approach")
>> Fixes: e4c5e13aa45c ("ipv6: Should use consistent conditional judgement for
>>  ip6 fragment between __ip6_append_data and ip6_finish_output")
>> Signed-off-by: Michal Kubecek 
>> Acked-by: Vlad Yasevich 
>> Signed-off-by: David S. Miller 
>> [SL: Drop changes to net/ipv4/ip_output.c because 4.9 doesn't have
>> e89e9cf539a2 ("[IPv4/IPv6]: UFO Scatter-gather approach")]
>
>Commit e89e9cf539a2 is in mainline since v2.6.16.28 so that 4.9 does
>have it. The conflict on cherry-pick is caused by missing commit
>e4c5e13aa45c but that is the same as in the IPv6 part. I'm adding the
>IPv4 part backported to 4.9 below
>
>> Signed-off-by: Sasha Levin 
>> ---
>>  net/ipv6/ip6_output.c | 2 +-
>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/net/ipv6/ip6_output.c b/net/ipv6/ip6_output.c
>> index 5a4b8e7bcedd..9213eba601d0 100644
>> --- a/net/ipv6/ip6_output.c
>> +++ b/net/ipv6/ip6_output.c
>> @@ -1376,7 +1376,7 @@ static int __ip6_append_data(struct sock *sk,
>>   */
>>
>>  cork->length += length;
>> -if length + fragheaderlen) > mtu) ||
>> +if length + (skb ? skb->len : headersize)) > mtu) ||
>>   (skb && skb_is_gso(skb))) &&
>>  (sk->sk_protocol == IPPROTO_UDP) &&
>>  (rt->dst.dev->features & NETIF_F_UFO) && !rt->dst.header_len &&
>> --
>> 2.11.0
>
>This is strange, it looks as if e4c5e13aa45c was already applied before
>but I can't see that in stable-4.9.y branch.

It was added in this patchset.

>diff --git a/net/ipv4/ip_output.c b/net/ipv4/ip_output.c
>index e5c1dbef3626..06215ba88b93 100644
>--- a/net/ipv4/ip_output.c
>+++ b/net/ipv4/ip_output.c
>@@ -936,7 +936,8 @@ static int __ip_append_data(struct sock *sk,
>   csummode = CHECKSUM_PARTIAL;
>
>   cork->length += length;
>-  if (((length > mtu) || (skb && skb_is_gso(skb))) &&
>+  if length + (skb ? skb->len : fragheaderlen)) > mtu) ||
>+   (skb && skb_is_gso(skb))) &&
>   (sk->sk_protocol == IPPROTO_UDP) &&
>   (rt->dst.dev->features & NETIF_F_UFO) && !rt->dst.header_len &&
>   (sk->sk_type == SOCK_DGRAM) && !sk->sk_no_check_tx) {
>diff --git a/net/ipv6/ip6_output.c b/net/ipv6/ip6_output.c
>index fd649599620e..9213eba601d0 100644
>--- a/net/ipv6/ip6_output.c
>+++ b/net/ipv6/ip6_output.c
>@@ -1376,7 +1376,7 @@ static int __ip6_append_data(struct sock *sk,
>*/
>
>   cork->length += length;
>-  if (((length > mtu) ||
>+  if length + (skb ? skb->len : headersize)) > mtu) ||
>(skb && skb_is_gso(skb))) &&
>   (sk->sk_protocol == IPPROTO_UDP) &&
>   (rt->dst.dev->features & NETIF_F_UFO) && !rt->dst.header_len &&

Thanks!

-- 

Thanks,
Sasha

Re: [PATCH for v4.9 LTS 86/87] net: account for current skb length when deciding about UFO

2017-07-24 Thread Levin, Alexander (Sasha Levin)
On Sat, Jul 15, 2017 at 10:53:27AM +0200, Michal Kubecek wrote:
>On Sat, Jul 15, 2017 at 01:26:27AM +, Levin, Alexander (Sasha Levin) wrote:
>> From: Michal Kubeček 
>>
>> [ Upstream commit a5cb659bbc1c8644efa0c3138a757a1e432a4880 ]
>>
>> Our customer encountered stuck NFS writes for blocks starting at specific
>> offsets w.r.t. page boundary caused by networking stack sending packets via
>> UFO enabled device with wrong checksum. The problem can be reproduced by
>> composing a long UDP datagram from multiple parts using MSG_MORE flag:
>>
>>   sendto(sd, buff, 1000, MSG_MORE, ...);
>>   sendto(sd, buff, 1000, MSG_MORE, ...);
>>   sendto(sd, buff, 3000, 0, ...);
>>
>> Assume this packet is to be routed via a device with MTU 1500 and
>> NETIF_F_UFO enabled. When second sendto() gets into __ip_append_data(),
>> this condition is tested (among others) to decide whether to call
>> ip_ufo_append_data():
>>
>>   ((length + fragheaderlen) > mtu) || (skb && skb_is_gso(skb))
>>
>> At the moment, we already have skb with 1028 bytes of data which is not
>> marked for GSO so that the test is false (fragheaderlen is usually 20).
>> Thus we append second 1000 bytes to this skb without invoking UFO. Third
>> sendto(), however, has sufficient length to trigger the UFO path so that we
>> end up with non-UFO skb followed by a UFO one. Later on, udp_send_skb()
>> uses udp_csum() to calculate the checksum but that assumes all fragments
>> have correct checksum in skb->csum which is not true for UFO fragments.
>>
>> When checking against MTU, we need to add skb->len to length of new segment
>> if we already have a partially filled skb and fragheaderlen only if there
>> isn't one.
>>
>> In the IPv6 case, skb can only be null if this is the first segment so that
>> we have to use headersize (length of the first IPv6 header) rather than
>> fragheaderlen (length of IPv6 header of further fragments) for skb == NULL.
>>
>> Fixes: e89e9cf539a2 ("[IPv4/IPv6]: UFO Scatter-gather approach")
>> Fixes: e4c5e13aa45c ("ipv6: Should use consistent conditional judgement for
>>  ip6 fragment between __ip6_append_data and ip6_finish_output")
>> Signed-off-by: Michal Kubecek 
>> Acked-by: Vlad Yasevich 
>> Signed-off-by: David S. Miller 
>> [SL: Drop changes to net/ipv4/ip_output.c because 4.9 doesn't have
>> e89e9cf539a2 ("[IPv4/IPv6]: UFO Scatter-gather approach")]
>
>Commit e89e9cf539a2 is in mainline since v2.6.16.28 so that 4.9 does
>have it. The conflict on cherry-pick is caused by missing commit
>e4c5e13aa45c but that is the same as in the IPv6 part. I'm adding the
>IPv4 part backported to 4.9 below
>
>> Signed-off-by: Sasha Levin 
>> ---
>>  net/ipv6/ip6_output.c | 2 +-
>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/net/ipv6/ip6_output.c b/net/ipv6/ip6_output.c
>> index 5a4b8e7bcedd..9213eba601d0 100644
>> --- a/net/ipv6/ip6_output.c
>> +++ b/net/ipv6/ip6_output.c
>> @@ -1376,7 +1376,7 @@ static int __ip6_append_data(struct sock *sk,
>>   */
>>
>>  cork->length += length;
>> -if length + fragheaderlen) > mtu) ||
>> +if length + (skb ? skb->len : headersize)) > mtu) ||
>>   (skb && skb_is_gso(skb))) &&
>>  (sk->sk_protocol == IPPROTO_UDP) &&
>>  (rt->dst.dev->features & NETIF_F_UFO) && !rt->dst.header_len &&
>> --
>> 2.11.0
>
>This is strange, it looks as if e4c5e13aa45c was already applied before
>but I can't see that in stable-4.9.y branch.

It was added in this patchset.

>diff --git a/net/ipv4/ip_output.c b/net/ipv4/ip_output.c
>index e5c1dbef3626..06215ba88b93 100644
>--- a/net/ipv4/ip_output.c
>+++ b/net/ipv4/ip_output.c
>@@ -936,7 +936,8 @@ static int __ip_append_data(struct sock *sk,
>   csummode = CHECKSUM_PARTIAL;
>
>   cork->length += length;
>-  if (((length > mtu) || (skb && skb_is_gso(skb))) &&
>+  if length + (skb ? skb->len : fragheaderlen)) > mtu) ||
>+   (skb && skb_is_gso(skb))) &&
>   (sk->sk_protocol == IPPROTO_UDP) &&
>   (rt->dst.dev->features & NETIF_F_UFO) && !rt->dst.header_len &&
>   (sk->sk_type == SOCK_DGRAM) && !sk->sk_no_check_tx) {
>diff --git a/net/ipv6/ip6_output.c b/net/ipv6/ip6_output.c
>index fd649599620e..9213eba601d0 100644
>--- a/net/ipv6/ip6_output.c
>+++ b/net/ipv6/ip6_output.c
>@@ -1376,7 +1376,7 @@ static int __ip6_append_data(struct sock *sk,
>*/
>
>   cork->length += length;
>-  if (((length > mtu) ||
>+  if length + (skb ? skb->len : headersize)) > mtu) ||
>(skb && skb_is_gso(skb))) &&
>   (sk->sk_protocol == IPPROTO_UDP) &&
>   (rt->dst.dev->features & NETIF_F_UFO) && !rt->dst.header_len &&

Thanks!

-- 

Thanks,
Sasha

Re: [PATCH for v4.9 LTS 86/87] net: account for current skb length when deciding about UFO

2017-07-15 Thread Michal Kubecek
On Sat, Jul 15, 2017 at 01:26:27AM +, Levin, Alexander (Sasha Levin) wrote:
> From: Michal Kubeček 
> 
> [ Upstream commit a5cb659bbc1c8644efa0c3138a757a1e432a4880 ]
> 
> Our customer encountered stuck NFS writes for blocks starting at specific
> offsets w.r.t. page boundary caused by networking stack sending packets via
> UFO enabled device with wrong checksum. The problem can be reproduced by
> composing a long UDP datagram from multiple parts using MSG_MORE flag:
> 
>   sendto(sd, buff, 1000, MSG_MORE, ...);
>   sendto(sd, buff, 1000, MSG_MORE, ...);
>   sendto(sd, buff, 3000, 0, ...);
> 
> Assume this packet is to be routed via a device with MTU 1500 and
> NETIF_F_UFO enabled. When second sendto() gets into __ip_append_data(),
> this condition is tested (among others) to decide whether to call
> ip_ufo_append_data():
> 
>   ((length + fragheaderlen) > mtu) || (skb && skb_is_gso(skb))
> 
> At the moment, we already have skb with 1028 bytes of data which is not
> marked for GSO so that the test is false (fragheaderlen is usually 20).
> Thus we append second 1000 bytes to this skb without invoking UFO. Third
> sendto(), however, has sufficient length to trigger the UFO path so that we
> end up with non-UFO skb followed by a UFO one. Later on, udp_send_skb()
> uses udp_csum() to calculate the checksum but that assumes all fragments
> have correct checksum in skb->csum which is not true for UFO fragments.
> 
> When checking against MTU, we need to add skb->len to length of new segment
> if we already have a partially filled skb and fragheaderlen only if there
> isn't one.
> 
> In the IPv6 case, skb can only be null if this is the first segment so that
> we have to use headersize (length of the first IPv6 header) rather than
> fragheaderlen (length of IPv6 header of further fragments) for skb == NULL.
> 
> Fixes: e89e9cf539a2 ("[IPv4/IPv6]: UFO Scatter-gather approach")
> Fixes: e4c5e13aa45c ("ipv6: Should use consistent conditional judgement for
>   ip6 fragment between __ip6_append_data and ip6_finish_output")
> Signed-off-by: Michal Kubecek 
> Acked-by: Vlad Yasevich 
> Signed-off-by: David S. Miller 
> [SL: Drop changes to net/ipv4/ip_output.c because 4.9 doesn't have
> e89e9cf539a2 ("[IPv4/IPv6]: UFO Scatter-gather approach")]

Commit e89e9cf539a2 is in mainline since v2.6.16.28 so that 4.9 does
have it. The conflict on cherry-pick is caused by missing commit
e4c5e13aa45c but that is the same as in the IPv6 part. I'm adding the
IPv4 part backported to 4.9 below

> Signed-off-by: Sasha Levin 
> ---
>  net/ipv6/ip6_output.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/net/ipv6/ip6_output.c b/net/ipv6/ip6_output.c
> index 5a4b8e7bcedd..9213eba601d0 100644
> --- a/net/ipv6/ip6_output.c
> +++ b/net/ipv6/ip6_output.c
> @@ -1376,7 +1376,7 @@ static int __ip6_append_data(struct sock *sk,
>*/
>  
>   cork->length += length;
> - if length + fragheaderlen) > mtu) ||
> + if length + (skb ? skb->len : headersize)) > mtu) ||
>(skb && skb_is_gso(skb))) &&
>   (sk->sk_protocol == IPPROTO_UDP) &&
>   (rt->dst.dev->features & NETIF_F_UFO) && !rt->dst.header_len &&
> -- 
> 2.11.0

This is strange, it looks as if e4c5e13aa45c was already applied before
but I can't see that in stable-4.9.y branch.


diff --git a/net/ipv4/ip_output.c b/net/ipv4/ip_output.c
index e5c1dbef3626..06215ba88b93 100644
--- a/net/ipv4/ip_output.c
+++ b/net/ipv4/ip_output.c
@@ -936,7 +936,8 @@ static int __ip_append_data(struct sock *sk,
csummode = CHECKSUM_PARTIAL;
 
cork->length += length;
-   if (((length > mtu) || (skb && skb_is_gso(skb))) &&
+   if length + (skb ? skb->len : fragheaderlen)) > mtu) ||
+(skb && skb_is_gso(skb))) &&
(sk->sk_protocol == IPPROTO_UDP) &&
(rt->dst.dev->features & NETIF_F_UFO) && !rt->dst.header_len &&
(sk->sk_type == SOCK_DGRAM) && !sk->sk_no_check_tx) {
diff --git a/net/ipv6/ip6_output.c b/net/ipv6/ip6_output.c
index fd649599620e..9213eba601d0 100644
--- a/net/ipv6/ip6_output.c
+++ b/net/ipv6/ip6_output.c
@@ -1376,7 +1376,7 @@ static int __ip6_append_data(struct sock *sk,
 */
 
cork->length += length;
-   if (((length > mtu) ||
+   if length + (skb ? skb->len : headersize)) > mtu) ||
 (skb && skb_is_gso(skb))) &&
(sk->sk_protocol == IPPROTO_UDP) &&
(rt->dst.dev->features & NETIF_F_UFO) && !rt->dst.header_len &&


Re: [PATCH for v4.9 LTS 86/87] net: account for current skb length when deciding about UFO

2017-07-15 Thread Michal Kubecek
On Sat, Jul 15, 2017 at 01:26:27AM +, Levin, Alexander (Sasha Levin) wrote:
> From: Michal Kubeček 
> 
> [ Upstream commit a5cb659bbc1c8644efa0c3138a757a1e432a4880 ]
> 
> Our customer encountered stuck NFS writes for blocks starting at specific
> offsets w.r.t. page boundary caused by networking stack sending packets via
> UFO enabled device with wrong checksum. The problem can be reproduced by
> composing a long UDP datagram from multiple parts using MSG_MORE flag:
> 
>   sendto(sd, buff, 1000, MSG_MORE, ...);
>   sendto(sd, buff, 1000, MSG_MORE, ...);
>   sendto(sd, buff, 3000, 0, ...);
> 
> Assume this packet is to be routed via a device with MTU 1500 and
> NETIF_F_UFO enabled. When second sendto() gets into __ip_append_data(),
> this condition is tested (among others) to decide whether to call
> ip_ufo_append_data():
> 
>   ((length + fragheaderlen) > mtu) || (skb && skb_is_gso(skb))
> 
> At the moment, we already have skb with 1028 bytes of data which is not
> marked for GSO so that the test is false (fragheaderlen is usually 20).
> Thus we append second 1000 bytes to this skb without invoking UFO. Third
> sendto(), however, has sufficient length to trigger the UFO path so that we
> end up with non-UFO skb followed by a UFO one. Later on, udp_send_skb()
> uses udp_csum() to calculate the checksum but that assumes all fragments
> have correct checksum in skb->csum which is not true for UFO fragments.
> 
> When checking against MTU, we need to add skb->len to length of new segment
> if we already have a partially filled skb and fragheaderlen only if there
> isn't one.
> 
> In the IPv6 case, skb can only be null if this is the first segment so that
> we have to use headersize (length of the first IPv6 header) rather than
> fragheaderlen (length of IPv6 header of further fragments) for skb == NULL.
> 
> Fixes: e89e9cf539a2 ("[IPv4/IPv6]: UFO Scatter-gather approach")
> Fixes: e4c5e13aa45c ("ipv6: Should use consistent conditional judgement for
>   ip6 fragment between __ip6_append_data and ip6_finish_output")
> Signed-off-by: Michal Kubecek 
> Acked-by: Vlad Yasevich 
> Signed-off-by: David S. Miller 
> [SL: Drop changes to net/ipv4/ip_output.c because 4.9 doesn't have
> e89e9cf539a2 ("[IPv4/IPv6]: UFO Scatter-gather approach")]

Commit e89e9cf539a2 is in mainline since v2.6.16.28 so that 4.9 does
have it. The conflict on cherry-pick is caused by missing commit
e4c5e13aa45c but that is the same as in the IPv6 part. I'm adding the
IPv4 part backported to 4.9 below

> Signed-off-by: Sasha Levin 
> ---
>  net/ipv6/ip6_output.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/net/ipv6/ip6_output.c b/net/ipv6/ip6_output.c
> index 5a4b8e7bcedd..9213eba601d0 100644
> --- a/net/ipv6/ip6_output.c
> +++ b/net/ipv6/ip6_output.c
> @@ -1376,7 +1376,7 @@ static int __ip6_append_data(struct sock *sk,
>*/
>  
>   cork->length += length;
> - if length + fragheaderlen) > mtu) ||
> + if length + (skb ? skb->len : headersize)) > mtu) ||
>(skb && skb_is_gso(skb))) &&
>   (sk->sk_protocol == IPPROTO_UDP) &&
>   (rt->dst.dev->features & NETIF_F_UFO) && !rt->dst.header_len &&
> -- 
> 2.11.0

This is strange, it looks as if e4c5e13aa45c was already applied before
but I can't see that in stable-4.9.y branch.


diff --git a/net/ipv4/ip_output.c b/net/ipv4/ip_output.c
index e5c1dbef3626..06215ba88b93 100644
--- a/net/ipv4/ip_output.c
+++ b/net/ipv4/ip_output.c
@@ -936,7 +936,8 @@ static int __ip_append_data(struct sock *sk,
csummode = CHECKSUM_PARTIAL;
 
cork->length += length;
-   if (((length > mtu) || (skb && skb_is_gso(skb))) &&
+   if length + (skb ? skb->len : fragheaderlen)) > mtu) ||
+(skb && skb_is_gso(skb))) &&
(sk->sk_protocol == IPPROTO_UDP) &&
(rt->dst.dev->features & NETIF_F_UFO) && !rt->dst.header_len &&
(sk->sk_type == SOCK_DGRAM) && !sk->sk_no_check_tx) {
diff --git a/net/ipv6/ip6_output.c b/net/ipv6/ip6_output.c
index fd649599620e..9213eba601d0 100644
--- a/net/ipv6/ip6_output.c
+++ b/net/ipv6/ip6_output.c
@@ -1376,7 +1376,7 @@ static int __ip6_append_data(struct sock *sk,
 */
 
cork->length += length;
-   if (((length > mtu) ||
+   if length + (skb ? skb->len : headersize)) > mtu) ||
 (skb && skb_is_gso(skb))) &&
(sk->sk_protocol == IPPROTO_UDP) &&
(rt->dst.dev->features & NETIF_F_UFO) && !rt->dst.header_len &&


[PATCH for v4.9 LTS 86/87] net: account for current skb length when deciding about UFO

2017-07-14 Thread Levin, Alexander (Sasha Levin)
From: Michal Kubeček 

[ Upstream commit a5cb659bbc1c8644efa0c3138a757a1e432a4880 ]

Our customer encountered stuck NFS writes for blocks starting at specific
offsets w.r.t. page boundary caused by networking stack sending packets via
UFO enabled device with wrong checksum. The problem can be reproduced by
composing a long UDP datagram from multiple parts using MSG_MORE flag:

  sendto(sd, buff, 1000, MSG_MORE, ...);
  sendto(sd, buff, 1000, MSG_MORE, ...);
  sendto(sd, buff, 3000, 0, ...);

Assume this packet is to be routed via a device with MTU 1500 and
NETIF_F_UFO enabled. When second sendto() gets into __ip_append_data(),
this condition is tested (among others) to decide whether to call
ip_ufo_append_data():

  ((length + fragheaderlen) > mtu) || (skb && skb_is_gso(skb))

At the moment, we already have skb with 1028 bytes of data which is not
marked for GSO so that the test is false (fragheaderlen is usually 20).
Thus we append second 1000 bytes to this skb without invoking UFO. Third
sendto(), however, has sufficient length to trigger the UFO path so that we
end up with non-UFO skb followed by a UFO one. Later on, udp_send_skb()
uses udp_csum() to calculate the checksum but that assumes all fragments
have correct checksum in skb->csum which is not true for UFO fragments.

When checking against MTU, we need to add skb->len to length of new segment
if we already have a partially filled skb and fragheaderlen only if there
isn't one.

In the IPv6 case, skb can only be null if this is the first segment so that
we have to use headersize (length of the first IPv6 header) rather than
fragheaderlen (length of IPv6 header of further fragments) for skb == NULL.

Fixes: e89e9cf539a2 ("[IPv4/IPv6]: UFO Scatter-gather approach")
Fixes: e4c5e13aa45c ("ipv6: Should use consistent conditional judgement for
ip6 fragment between __ip6_append_data and ip6_finish_output")
Signed-off-by: Michal Kubecek 
Acked-by: Vlad Yasevich 
Signed-off-by: David S. Miller 
[SL: Drop changes to net/ipv4/ip_output.c because 4.9 doesn't have
e89e9cf539a2 ("[IPv4/IPv6]: UFO Scatter-gather approach")]
Signed-off-by: Sasha Levin 
---
 net/ipv6/ip6_output.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/net/ipv6/ip6_output.c b/net/ipv6/ip6_output.c
index 5a4b8e7bcedd..9213eba601d0 100644
--- a/net/ipv6/ip6_output.c
+++ b/net/ipv6/ip6_output.c
@@ -1376,7 +1376,7 @@ static int __ip6_append_data(struct sock *sk,
 */
 
cork->length += length;
-   if length + fragheaderlen) > mtu) ||
+   if length + (skb ? skb->len : headersize)) > mtu) ||
 (skb && skb_is_gso(skb))) &&
(sk->sk_protocol == IPPROTO_UDP) &&
(rt->dst.dev->features & NETIF_F_UFO) && !rt->dst.header_len &&
-- 
2.11.0


[PATCH for v4.9 LTS 86/87] net: account for current skb length when deciding about UFO

2017-07-14 Thread Levin, Alexander (Sasha Levin)
From: Michal Kubeček 

[ Upstream commit a5cb659bbc1c8644efa0c3138a757a1e432a4880 ]

Our customer encountered stuck NFS writes for blocks starting at specific
offsets w.r.t. page boundary caused by networking stack sending packets via
UFO enabled device with wrong checksum. The problem can be reproduced by
composing a long UDP datagram from multiple parts using MSG_MORE flag:

  sendto(sd, buff, 1000, MSG_MORE, ...);
  sendto(sd, buff, 1000, MSG_MORE, ...);
  sendto(sd, buff, 3000, 0, ...);

Assume this packet is to be routed via a device with MTU 1500 and
NETIF_F_UFO enabled. When second sendto() gets into __ip_append_data(),
this condition is tested (among others) to decide whether to call
ip_ufo_append_data():

  ((length + fragheaderlen) > mtu) || (skb && skb_is_gso(skb))

At the moment, we already have skb with 1028 bytes of data which is not
marked for GSO so that the test is false (fragheaderlen is usually 20).
Thus we append second 1000 bytes to this skb without invoking UFO. Third
sendto(), however, has sufficient length to trigger the UFO path so that we
end up with non-UFO skb followed by a UFO one. Later on, udp_send_skb()
uses udp_csum() to calculate the checksum but that assumes all fragments
have correct checksum in skb->csum which is not true for UFO fragments.

When checking against MTU, we need to add skb->len to length of new segment
if we already have a partially filled skb and fragheaderlen only if there
isn't one.

In the IPv6 case, skb can only be null if this is the first segment so that
we have to use headersize (length of the first IPv6 header) rather than
fragheaderlen (length of IPv6 header of further fragments) for skb == NULL.

Fixes: e89e9cf539a2 ("[IPv4/IPv6]: UFO Scatter-gather approach")
Fixes: e4c5e13aa45c ("ipv6: Should use consistent conditional judgement for
ip6 fragment between __ip6_append_data and ip6_finish_output")
Signed-off-by: Michal Kubecek 
Acked-by: Vlad Yasevich 
Signed-off-by: David S. Miller 
[SL: Drop changes to net/ipv4/ip_output.c because 4.9 doesn't have
e89e9cf539a2 ("[IPv4/IPv6]: UFO Scatter-gather approach")]
Signed-off-by: Sasha Levin 
---
 net/ipv6/ip6_output.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/net/ipv6/ip6_output.c b/net/ipv6/ip6_output.c
index 5a4b8e7bcedd..9213eba601d0 100644
--- a/net/ipv6/ip6_output.c
+++ b/net/ipv6/ip6_output.c
@@ -1376,7 +1376,7 @@ static int __ip6_append_data(struct sock *sk,
 */
 
cork->length += length;
-   if length + fragheaderlen) > mtu) ||
+   if length + (skb ? skb->len : headersize)) > mtu) ||
 (skb && skb_is_gso(skb))) &&
(sk->sk_protocol == IPPROTO_UDP) &&
(rt->dst.dev->features & NETIF_F_UFO) && !rt->dst.header_len &&
-- 
2.11.0