Hi Cascardo,

We haven't tested these proposals yet; however, we did manage to test and 
identify what really causes the performance drop in the commit, and both 
proposals are designed to avoid that situation.
As I mentioned earlier, there are two bottlenecks introduced by the commit.

1) In pkt_metadata_init(), the statement "md->tunnel.ipv6_dst = in6addr_any;" 
causes a performance drop because every packet has to spend extra cycles on 
this 16-byte copy. 
2) Before the commit, the presence of a tunnel header was validated simply by 
checking "flow->tunnel.ip_dst". This commit introduces an additional check of 
the 16-byte IPv6 address (flow_tnl_dst_is_set) alongside it; see the sketch 
below. 
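
For reference, here is a minimal sketch of the two per-packet costs, assuming 
the field layout shown in the proposals below (tnl_md, tnl_md_init and 
tnl_dst_is_set are illustrative names, not the verbatim OVS source):

    #include <netinet/in.h>   /* struct in6_addr, in6addr_any */
    #include <stdbool.h>
    #include <stdint.h>
    #include <string.h>

    typedef uint32_t ovs_be32;    /* big-endian 32-bit value, as in OVS */

    struct tnl_md {
        ovs_be32 ip_dst;          /* IPv4 tunnel destination. */
        struct in6_addr ipv6_dst; /* IPv6 tunnel destination (16 bytes). */
    };

    /* Cost 1: every packet pays for this 16-byte store, even when no
     * tunnel metadata is present at all. */
    static inline void
    tnl_md_init(struct tnl_md *md)
    {
        md->ip_dst = 0;
        md->ipv6_dst = in6addr_any;
    }

    /* Cost 2: "is a tunnel header present?" grew from one 32-bit compare
     * to a 32-bit compare plus a 16-byte compare. */
    static inline bool
    tnl_dst_is_set(const struct tnl_md *md)
    {
        return md->ip_dst != 0
               || memcmp(&md->ipv6_dst, &in6addr_any,
                         sizeof md->ipv6_dst) != 0;
    }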

The proposals given below obviate both this initialization and the validation 
of the 16-byte IPv6 address; instead, they initialize and check a newly 
introduced flag. 
I am not sure I understood what you mean by IPv4-mapped addresses. Would you 
use the IPv6 struct for both IPv4 and IPv6 addresses in the future? In that 
case, the cost of initializing the 16-byte IPv6 address would still exist, and 
we would gain nothing on the comparison either. I feel that validating an 
8/16/32-bit flag needs only the same CPU cycles as validating a 32-bit IPv4 
address. We also verified that more than half of the performance drop comes 
from the initialization of the IPv6 address itself; see the illustration 
below.
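
For what it's worth, my understanding of an IPv4-mapped address is the 
RFC 4291 ::ffff:a.b.c.d form; here is a hypothetical sketch of storing an 
IPv4 destination that way (set_mapped_ipv4 is an illustrative name, not an 
OVS function):

    #include <netinet/in.h>
    #include <stdint.h>
    #include <string.h>

    /* Hypothetical: store an IPv4 destination as an RFC 4291 IPv4-mapped
     * IPv6 address (::ffff:a.b.c.d).  Note the full 16-byte write: the
     * initialization cost we measured would remain with this layout. */
    static void
    set_mapped_ipv4(struct in6_addr *addr, uint32_t ip_be /* network order */)
    {
        memset(addr, 0, sizeof *addr);          /* bytes 0..9 are zero */
        addr->s6_addr[10] = 0xff;               /* bytes 10..11 are 0xffff */
        addr->s6_addr[11] = 0xff;
        memcpy(&addr->s6_addr[12], &ip_be, 4);  /* bytes 12..15 hold a.b.c.d */
    }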

We haven't tested large/medium packets because we already know where the 
additional CPU cycles went.


Regards
_Sugesh


-----Original Message-----
From: casca...@redhat.com [mailto:casca...@redhat.com] 
Sent: Friday, December 11, 2015 1:48 PM
To: Chandran, Sugesh <sugesh.chand...@intel.com>
Cc: dev@openvswitch.org; jb...@redhat.com
Subject: Re: ipv6 tunneling: Fix for performance drop introduced by ipv6 tunnel 
support.

On Fri, Dec 11, 2015 at 01:10:13PM +0000, Chandran, Sugesh wrote:
> Hi,
> We found that there is a serious performance drop in the OVS userspace 
> datapath after the commit "tunneling: add IPv6 support to 
> netdev_tunnel_config" (commit ID: 3ae91c019019889fbe93aa7dfd6e3747da362b24) 
> in the OVS tree.  The PHY-PHY throughput dropped by almost 25% for 64-byte 
> packets due to this check-in. 
> We nailed down the root cause of the issue: the newly added IPv6 metadata 
> initialization, and the validation that checks on every packet whether it 
> is set, cause the performance drop. We have two proposals to fix the issue, 
> as below.
> 
> Proposal: 1
> ========
> This approach uses a flag 'tun_type' to decide whether the tunnel metadata 
> is initialized; if it is, the flag also tells what type of tunnel it is.
> This way the overhead of memsetting the IPv6 address goes away, and the 
> tunnel data can be validated using the newly introduced flag; see the 
> sketch after the diff.
> Please note that this approach uses a union of the IPv4 and IPv6 tunnel 
> data, and the flag decides which one is valid for a specific packet. I hope 
> this will avoid unnecessary memory usage across the different tunnel types.
> 
> diff --git a/lib/packets.h b/lib/packets.h
> index edf140b..d68d017 100644
> --- a/lib/packets.h
> +++ b/lib/packets.h
> @@ -35,12 +35,25 @@
>  struct dp_packet;
>  struct ds;
>  
> +enum flow_tnl_type {
> +        FLOW_TUNNEL_NONE = 0,
> +        FLOW_TUNNEL_IPV4,
> +        FLOW_TUNNEL_IPV6
> +};
> +
>  /* Tunnel information used in flow key and metadata. */
>  struct flow_tnl {
> -    ovs_be32 ip_dst;
> -    struct in6_addr ipv6_dst;
> -    ovs_be32 ip_src;
> -    struct in6_addr ipv6_src;
> +    enum flow_tnl_type tun_type;
> +    union flow_tnl_l3_hdr {
> +        struct tnl_ip_hdr {
> +            ovs_be32 ip_dst;
> +            ovs_be32 ip_src;
> +        } ip;
> +        struct tnl_ip6_hdr {
> +            struct in6_addr ipv6_dst;
> +            struct in6_addr ipv6_src;
> +        } ip6;
> +    } tnl_l3_hdr;
>      ovs_be64 tun_id;
>      uint16_t flags;
>      uint8_t ip_tos;
> @@ -76,10 +89,11 @@ struct flow_tnl {
> 
> This approach introduces changes in many modules such as sflow, ofproto, 
> and the tunnel config, so it definitely needs a good amount of testing, 
> though the approach is much cleaner.
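> 
> With this layout, the per-packet presence check could collapse to a single 
> integer compare, along these lines (an illustrative sketch, not part of 
> the diff):
> 
>     static inline bool
>     flow_tnl_dst_is_set(const struct flow_tnl *tnl)
>     {
>         /* One enum compare replaces the 32-bit plus 16-byte compares. */
>         return tnl->tun_type != FLOW_TUNNEL_NONE;
>     }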
> 
> Proposal: 2
> =========
> This approach also avoids memsetting the entire IPv6 address, but uses a 
> flag inside the destination IPv6 address struct to check whether the 
> tunnel header is valid; see the sketch after the diff.
> 
> diff --git a/lib/packets.h b/lib/packets.h
> index edf140b..4822c50 100644
> --- a/lib/packets.h
> +++ b/lib/packets.h
> @@ -35,10 +35,16 @@
>  struct dp_packet;
>  struct ds;
>  
> +struct tnl_in6_addr {
> +    union {
> +        uint8_t u_s6_addr[16];
> +    } u;
> +    bool is_tnl_valid;  /* Flag: is the tunnel header valid? */
> +};
>  /* Tunnel information used in flow key and metadata. */
>  struct flow_tnl {
>      ovs_be32 ip_dst;
> -    struct in6_addr ipv6_dst;
> +    struct tnl_in6_addr ipv6_dst;
>      ovs_be32 ip_src;
>      struct in6_addr ipv6_src;
>      ovs_be64 tun_id;
> 
> This approach has relatively fewer code changes, but does not look as 
> clean as the first one. 
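> 
> With this layout, the initialization and the check would touch only the 
> flag (an illustrative sketch, not part of the diff; flow_tnl_dst_clear is 
> a hypothetical helper name):
> 
>     /* Per-packet init: a one-byte store instead of a 16-byte copy. */
>     static inline void
>     flow_tnl_dst_clear(struct flow_tnl *tnl)
>     {
>         tnl->ipv6_dst.is_tnl_valid = false;
>     }
> 
>     /* Presence check: a one-byte load instead of a 16-byte compare. */
>     static inline bool
>     flow_tnl_dst_is_set(const struct flow_tnl *tnl)
>     {
>         return tnl->ipv6_dst.is_tnl_valid;
>     }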
> 
> What do you think about these two approaches to avoiding the overhead? 
> Please let me know your comments/suggestions/possible implications, if any.
> Also feel free to share a better way to handle this issue if there is one.
> 
> 
> Regards
> _Sugesh
> 

Hi, Sugesh.

Have you tested either of the approaches? What are the performance 
improvements? I would like to have IPv4-mapped addresses, but checking those 
may be more expensive than checking a single bit. Is it really so much more 
expensive? Are there other ways to optimize it? Hence, I would like to see 
numbers for the proposals. How much do they improve your throughput? What 
about medium packets? How do they perform before and after this commit?

Thanks.
Cascardo.