On Wed, Dec 20, 2017 at 2:38 AM, David Miller <da...@davemloft.net> wrote:
> From: Xin Long <lucien....@gmail.com>
> Date: Wed, 20 Dec 2017 01:05:32 +0800
>
>> On Wed, Dec 20, 2017 at 12:12 AM, David Miller <da...@davemloft.net> wrote:
>>> You're going to have to find a way to fix this without
>>> invoking ->update_pmtu() on every single transmit.  That's
>>> really excessive, especially for an operation which is
>>> going to be a NOP %99.9999 of the time.
>> understand, I couldn't find a better way,  and all iptunnels are
>> doing it in this way.
>>
>> Or is it possible to go with an unlikely here ?
>>
>>                 if (unlikely(skb_dst(skb) && mtu < dst_mtu(skb_dst(skb))))
>>                         skb_dst(skb)->ops->update_pmtu(skb_dst(skb), NULL,
>>                                                        skb, mtu);
>>
>>
>  ...
>> how about doing it in vxlan_get_route():
>> @@ -1896,6 +1896,13 @@ static struct rtable *vxlan_get_route(struct
>> vxlan_dev *vxlan, struct net_device
>>                 *saddr = fl4.saddr;
>>                 if (use_cache)
>>                         dst_cache_set_ip4(dst_cache, &rt->dst, fl4.saddr);
>> +
>> +               if (skb_dst(skb)) {
>> +                       int mtu = dst_mtu(ndst) - VXLAN_HEADROOM;
>> +
>> +                       skb_dst(skb)->ops->update_pmtu(skb_dst(skb), NULL,
>> +                                                      skb, mtu);
>> +               }
>>
>>
>> This would do it only when no dst_cache and it has to do real route lookup.
>>
>> Note that even when update_pmtu is hit, mostly it will do nothing and
>> just return
>> as usually new mtu >= skb_dst(skb)'s pmtu.
>
> Ok, yeah, this is really difficult.
>
> I'll apply your patch for now, but generally speaking we have to handle this
> issue better.
Thanks Dave.

I'm still thinking about how to support for all udp tunnels icmp packet process,
It's also difficult, as no udp sock for TX.

is it possible we do like the following patch, though I know it's pretty bad
to still update pmtu even if no sock is found. do you have any suggestion
about this ?

diff --git a/net/ipv4/udp.c b/net/ipv4/udp.c
index e4ff25c..b3d2a50 100644
--- a/net/ipv4/udp.c
+++ b/net/ipv4/udp.c
@@ -608,6 +608,36 @@ static inline bool __udp_is_mcast_sock(struct net
*net, struct sock *sk,
        return true;
 }

+static int udp4_encap_err_handler(struct sk_buff *skb, u32 info)
+{
+       struct iphdr *iph = (struct iphdr *)skb->data;
+       struct net *net = dev_net(skb->dev);
+       int type = icmp_hdr(skb)->type;
+       int code = icmp_hdr(skb)->code;
+
+       switch (type) {
+       case ICMP_DEST_UNREACH:
+               if (code == ICMP_FRAG_NEEDED) {
+                       ipv4_update_pmtu(skb, net, info, 0, 0,
+                                        iph->protocol, 0);
+                       break;
+               } else if (code == ICMP_SR_FAILED) {
+                       break;
+               }
+               return -ENOENT;
+       case ICMP_TIME_EXCEEDED:
+               if (code != ICMP_EXC_TTL)
+                       break;
+               return -ENOENT;
+       case ICMP_REDIRECT:
+               ipv4_redirect(skb, net, 0, 0, iph->protocol, 0);
+       default:
+               break;
+       }
+
 /*
  * This routine is called by the ICMP module when it gets some
  * sort of error condition.  If err < 0 then the socket should
@@ -635,7 +665,8 @@ void __udp4_lib_err(struct sk_buff *skb, u32 info,
struct udp_table *udptable)
                               iph->saddr, uh->source, skb->dev->ifindex, 0,
                               udptable, NULL);
        if (!sk) {
-               __ICMP_INC_STATS(net, ICMP_MIB_INERRORS);
+               if (udp4_encap_err_handler(skb, info))
+                       __ICMP_INC_STATS(net, ICMP_MIB_INERRORS);
                return; /* No socket for error */
        }

Reply via email to