On Wed, Dec 20, 2017 at 12:12 AM, David Miller <da...@davemloft.net> wrote:
> From: Xin Long <lucien....@gmail.com>
> Date: Mon, 18 Dec 2017 14:20:56 +0800
>
>> Unlike ip tunnels, now vxlan doesn't do any pmtu update for
>> upper dst pmtu, even if it doesn't match the lower dst pmtu
>> any more.
>>
>> The problem can be reproduced when reducing the vxlan lower
>> dev's pmtu when running netperf. In jianlin's testing, the
>> performance dropped to 1/7 of what it was before.
>>
>> This patch is to update the upper dst pmtu to match the lower
>> dst pmtu on tx path so that packets can be sent out even when
>> lower dev's pmtu has been changed.
>>
>> It also works for metadata dst.
>>
>> Note that this patch doesn't process any pmtu icmp packet.
>> But even if udp tunnels gain support for processing pmtu icmp
>> packets in the future, that support will also need this.
>>
>> The same thing will be done for geneve in another patch.
>>
>> Signed-off-by: Xin Long <lucien....@gmail.com>
>
> Yikes...
>
> You're going to have to find a way to fix this without
> invoking ->update_pmtu() on every single transmit.  That's
> really excessive, especially for an operation which is
> going to be a NOP 99.9999% of the time.
Understood. I couldn't find a better way, and all the ip tunnels are
doing it this way.

Or would it be acceptable to go with an unlikely() here?

                if (unlikely(skb_dst(skb) && mtu < dst_mtu(skb_dst(skb))))
                        skb_dst(skb)->ops->update_pmtu(skb_dst(skb), NULL,
                                                       skb, mtu);
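
To make the cost argument concrete, here is a rough userspace sketch (not
kernel code) of that guarded call. struct fake_dst, fake_update_pmtu() and
tx_check_pmtu() are made-up names for illustration only; just the
comparison mirrors the snippet above. The unlikely() macro is defined
locally on top of the GCC/Clang builtin.

```c
#include <assert.h>
#include <stddef.h>

/* Local stand-in for the kernel macro; relies on a GCC/Clang builtin. */
#define unlikely(x) __builtin_expect(!!(x), 0)

/* Hypothetical stand-in for the upper dst, not the kernel's dst_entry. */
struct fake_dst {
	int pmtu;          /* cached path MTU of the upper dst */
	int update_calls;  /* how many times the update was invoked */
};

/* Stand-in for ->update_pmtu(): counts the call, lowers the cached pmtu. */
static void fake_update_pmtu(struct fake_dst *dst, int mtu)
{
	assert(dst != NULL);
	dst->update_calls++;
	if (mtu < dst->pmtu)
		dst->pmtu = mtu;
}

/*
 * Per-packet check on tx. 'mtu' is assumed to be the lower dst's mtu with
 * the tunnel overhead already subtracted. The comparison still runs for
 * every packet; only the indirect call is skipped in the common case.
 */
static void tx_check_pmtu(struct fake_dst *dst, int mtu)
{
	if (unlikely(dst != NULL && mtu < dst->pmtu))
		fake_update_pmtu(dst, mtu);
}
```

With this shape, a stream of packets over an unchanged path never reaches
fake_update_pmtu(); the first packet after the lower pmtu shrinks does,
and later packets skip it again.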



>
> We need some way, instead, for the MTU change event to propagate
> properly.  I know this might be hard, but doing this in the transmit
> handler on every packet to deal with it is not the way to go.
How about doing it in vxlan_get_route() instead:
@@ -1896,6 +1896,13 @@ static struct rtable *vxlan_get_route(struct vxlan_dev *vxlan, struct net_device
                *saddr = fl4.saddr;
                if (use_cache)
                        dst_cache_set_ip4(dst_cache, &rt->dst, fl4.saddr);
+
+               if (skb_dst(skb)) {
+                       int mtu = dst_mtu(&rt->dst) - VXLAN_HEADROOM;
+
+                       skb_dst(skb)->ops->update_pmtu(skb_dst(skb), NULL,
+                                                      skb, mtu);
+               }


This would do it only when there is no dst_cache hit and a real route
lookup has to be done.

Note that even when update_pmtu is called, it will mostly do nothing and
just return, as the new mtu is usually >= skb_dst(skb)'s pmtu.
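
As a rough illustration of that behaviour, here is a small userspace model
(again not kernel code). All fake_* names and get_route_model() are
hypothetical; the only value taken from the kernel is the IPv4
VXLAN_HEADROOM of 20 + 8 + 8 + 14 = 50 bytes. Clearing cache->rt by hand
stands in for whatever invalidates the cached route when the lower path
changes, which is an assumption of this sketch.

```c
#include <assert.h>
#include <stddef.h>

/* IPv4 + UDP + VXLAN + inner Ethernet header, as in the kernel's define. */
#define VXLAN_HEADROOM (20 + 8 + 8 + 14)

struct fake_route { int mtu; };                   /* lower dst */
struct fake_upper_dst { int pmtu; int update_calls; };
struct fake_dst_cache { const struct fake_route *rt; };

/* Stand-in for ->update_pmtu(): a no-op unless the new mtu is smaller. */
static void fake_upper_update_pmtu(struct fake_upper_dst *dst, int mtu)
{
	assert(dst != NULL);
	dst->update_calls++;
	if (mtu < dst->pmtu)
		dst->pmtu = mtu;
}

/*
 * Model of the proposed placement: a cache hit returns early and does no
 * pmtu work; only a miss (a "real" lookup, passed in as lookup_result)
 * stores the route and then updates the upper dst.
 */
static const struct fake_route *
get_route_model(struct fake_dst_cache *cache,
		const struct fake_route *lookup_result,
		struct fake_upper_dst *upper)
{
	if (cache->rt != NULL)
		return cache->rt;               /* fast path */

	cache->rt = lookup_result;              /* slow path: cache it */
	if (upper != NULL)
		fake_upper_update_pmtu(upper,
				       lookup_result->mtu - VXLAN_HEADROOM);
	return cache->rt;
}
```

For example, with a lower mtu of 1500 the first lookup calls the update
with 1450, which changes nothing if the upper pmtu is already 1450. After
the cache is dropped and the lower mtu becomes 1400, the next lookup
lowers the upper pmtu to 1350.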


>
> Thanks.
>
