Neil, 

Nice patch! One question: which gcc versions did you try this on? We'll 
round that out by checking the other versions. 

Regards, 
-Venky

-----Original Message-----
From: dev [mailto:dev-boun...@dpdk.org] On Behalf Of Neil Horman
Sent: Thursday, July 24, 2014 11:24 AM
To: dev at dpdk.org
Subject: [dpdk-dev] [PATCH] ixgbe: convert sse intrinsics to use __builtin variants

The ixgbe pmd currently can't be built without enabling sse instructions at 
compile time.  While sse extensions provide better performance, there's no 
reason that we can't still create builds to run on systems that don't support 
sse.  If we modify the ixgbe code to use the __builtin_shuffle and 
__builtin_popcountll functions, I've confirmed that the gcc compiler emits the 
appropriate sse instructions when the provided -march parameter indicates a 
machine that includes sse support, and emits generic code when sse isn't 
available.

Signed-off-by: Neil Horman <nhorman at tuxdriver.com>
CC: Thomas Monjalon <thomas.monjalon at 6wind.com>
---
 lib/librte_pmd_ixgbe/ixgbe_rxtx_vec.c | 12 +++++-------
 1 file changed, 5 insertions(+), 7 deletions(-)

diff --git a/lib/librte_pmd_ixgbe/ixgbe_rxtx_vec.c b/lib/librte_pmd_ixgbe/ixgbe_rxtx_vec.c
index 09e19a3..5747072 100644
--- a/lib/librte_pmd_ixgbe/ixgbe_rxtx_vec.c
+++ b/lib/librte_pmd_ixgbe/ixgbe_rxtx_vec.c
@@ -38,8 +38,6 @@
 #include "ixgbe_ethdev.h"
 #include "ixgbe_rxtx.h"

-#include <nmmintrin.h>
-
 #ifndef __INTEL_COMPILER
 #pragma GCC diagnostic ignored "-Wcast-qual"
 #endif
@@ -294,8 +292,8 @@ ixgbe_recv_pkts_vec(void *rx_queue, struct rte_mbuf **rx_pkts,
                rte_compiler_barrier();

                /* D.1 pkt 3,4 convert format from desc to pktmbuf */
-               pkt_mb4 = _mm_shuffle_epi8(descs[3], shuf_msk);
-               pkt_mb3 = _mm_shuffle_epi8(descs[2], shuf_msk);
+               pkt_mb4 = __builtin_shuffle(descs[3], shuf_msk);
+               pkt_mb3 = __builtin_shuffle(descs[2], shuf_msk);

                /* C.1 4=>2 filter staterr info only */
                sterr_tmp2 = _mm_unpackhi_epi32(descs[3], descs[2]);
@@ -310,8 +308,8 @@ ixgbe_recv_pkts_vec(void *rx_queue, struct rte_mbuf **rx_pkts,
                pkt_mb3 = _mm_add_epi16(pkt_mb3, in_port);

                /* D.1 pkt 1,2 convert format from desc to pktmbuf */
-               pkt_mb2 = _mm_shuffle_epi8(descs[1], shuf_msk);
-               pkt_mb1 = _mm_shuffle_epi8(descs[0], shuf_msk);
+               pkt_mb2 = __builtin_shuffle(descs[1], shuf_msk);
+               pkt_mb1 = __builtin_shuffle(descs[0], shuf_msk);

                /* C.2 get 4 pkts staterr value  */
                zero = _mm_xor_si128(dd_check, dd_check);
@@ -338,7 +336,7 @@ ixgbe_recv_pkts_vec(void *rx_queue, struct rte_mbuf **rx_pkts,
                                pkt_mb1);

                /* C.4 calc avaialbe number of desc */
-               var = _mm_popcnt_u64(_mm_cvtsi128_si64(staterr));
+               var = __builtin_popcountll(_mm_cvtsi128_si64(staterr));
                nb_pkts_recd += var;
                if (likely(var != RTE_IXGBE_DESCS_PER_LOOP))
                        break;
--
1.8.3.1
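
For anyone wanting to try the two builtins from the commit message outside the driver, here is a minimal standalone sketch. It is not part of the patch: the type `v16u8` and the helpers `reverse16`, `count_set_bits` and `self_check` are hypothetical names made up for illustration, and it assumes gcc 4.7 or later (`__builtin_shuffle` is a GCC extension). One thing to keep in mind when comparing against `_mm_shuffle_epi8`: `__builtin_shuffle` works on whatever element type the vector has and takes each index modulo the element count, so a byte-wise shuffle needs a vector of sixteen 8-bit lanes, and a mask byte with its high bit set does not zero the lane the way pshufb does.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical 16-lane byte vector built with GCC's vector extension.
 * This is an illustration type, not something defined by DPDK. */
typedef uint8_t v16u8 __attribute__((vector_size(16)));

/* Reverse 16 bytes with a byte-granular generic shuffle.
 * The mask has the same lane count and lane width as the data,
 * which is what __builtin_shuffle requires. */
static void reverse16(const uint8_t src[16], uint8_t dst[16])
{
	const v16u8 mask = { 15, 14, 13, 12, 11, 10, 9, 8,
			     7, 6, 5, 4, 3, 2, 1, 0 };
	v16u8 data, out;

	memcpy(&data, src, sizeof(data));
	out = __builtin_shuffle(data, mask);
	memcpy(dst, &out, sizeof(out));
}

/* Population count of a 64-bit value; gcc picks popcnt or a
 * generic sequence depending on the target flags. */
static int count_set_bits(uint64_t x)
{
	return __builtin_popcountll(x);
}

/* Small sanity check of both helpers. */
static void self_check(void)
{
	uint8_t in[16], out[16];
	int i;

	for (i = 0; i < 16; i++)
		in[i] = (uint8_t)i;
	reverse16(in, out);
	assert(out[0] == 15 && out[15] == 0);
	assert(count_set_bits(0x0101010101010101ULL) == 8);
}
```

Building this once with something like -march=corei7 and once with a target that lacks sse4.2, then diffing the output of gcc -S, is one way to see which instructions gcc picks for each builtin on a given compiler version.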
