13/11/2018 01:14, Stephen Hemminger:
> On Mon, Nov 12, 2018, 4:01 PM Thomas Monjalon <tho...@monjalon.net> wrote:
>
> > A bug was found when the inline function mlx5_tx_complete()
> > is optimized with AVX512F instructions. It corrupts an offset
> > in the instructions vmovdqu8 of the AVX2 version of rte_mov128(),
> > used in rte_memcpy(), which is called in rte_mempool_put_bulk().
> >
> > All the above functions are inline. So the workaround is
> > to disable AVX512F optimization for the functions calling the
> > top-level function of this call stack, i.e. mlx5_tx_complete().
> > All GCC versions supporting AVX512 are supposed to be affected.
> >
> > The root cause is not identified yet. It may be thought that
> > more related bugs may happen in other functions.
> > That's why the initial workaround was to disable AVX512F globally.
> > This patch takes the risk of applying the workaround only for the
> > functions known to be affected, in order to preserve the optimization
> > everywhere else.
> >
> > Bugzilla ID: 97
> > Fixes: 8d07c82b239f ("mk: disable gcc AVX512F support")
> >
> > Signed-off-by: Thomas Monjalon <tho...@monjalon.net>
>
> The additional annotations clutter the code.
> How big a performance hit is it to disable for whole driver? Or just use
> memcpy instead of rte_memcpy?

rte_memcpy() is used via rte_mempool_put_bulk().
I am not going to change it to memcpy...

About disabling AVX512F for the whole driver,
the goal of this patch is to reduce the scope of the workaround.
If a per-function scope is not chosen, then we can stay with
a global safe scope.

If you are interested in knowing more, the bugzilla has tons of info:
	https://bugs.dpdk.org/show_bug.cgi?id=97

Given that we don't get much help on this major GCC bug,
we are probably going to stay on the safe side.
Anyway, I must stop working (alone) on this bug and instead
focus on getting 18.11 out.
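
For reference, the per-function scope discussed above could look roughly
like the sketch below. It is only an illustration of the GCC
target("no-avx512f") function attribute, not the code of the patch:
the macro name NO_AVX512F and the functions copy_bulk() and
tx_complete_caller() are hypothetical stand-ins for the annotations
and for the inline call chain ending in rte_memcpy().

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical macro (not the name used in the patch): ask GCC to
 * compile the annotated function without AVX512F, whatever -m flags
 * are given on the command line. Expands to nothing elsewhere.
 */
#if defined(__GNUC__) && !defined(__clang__) && \
	(defined(__x86_64__) || defined(__i386__))
#define NO_AVX512F __attribute__((target("no-avx512f")))
#else
#define NO_AVX512F
#endif

/* Stand-in for an inline copy helper such as rte_memcpy(). */
static inline void
copy_bulk(void *dst, const void *src, size_t n)
{
	memcpy(dst, src, n);
}

/*
 * Stand-in for a function calling the top of the inline call stack.
 * The attribute is placed here, on the caller, so that the code
 * generated for this function is not optimized with AVX512F.
 */
NO_AVX512F void
tx_complete_caller(unsigned char *dst, const unsigned char *src, size_t n)
{
	copy_bulk(dst, src, n);
}
```

The cost of this approach is the one raised in the review: every
affected caller needs its own annotation, whereas the global scope
needs none.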