Re: [dpdk-dev] [PATCH 2/4] hash: add memory ordering to avoid race conditions

Wang, Yipeng1 Mon, 01 Oct 2018 15:44:17 -0700

>-----Original Message-----
>From: Honnappa Nagarahalli [mailto:honnappa.nagaraha...@arm.com]
>Sent: Sunday, September 30, 2018 3:21 PM
>To: Wang, Yipeng1 <yipeng1.w...@intel.com>; Richardson, Bruce 
><bruce.richard...@intel.com>; De Lara Guarch, Pablo
><pablo.de.lara.gua...@intel.com>
>Cc: dev@dpdk.org; Gavin Hu (Arm Technology China) <gavin...@arm.com>; Steve 
>Capper <steve.cap...@arm.com>; Ola Liljedahl
><ola.liljed...@arm.com>; nd <n...@arm.com>
>Subject: RE: [dpdk-dev] [PATCH 2/4] hash: add memory ordering to avoid race 
>conditions
>
>>
>> Some general comments for  the various __atomic_store/load added,
>>
>> 1. Although it passes the compiler check, but I just want to confirm that if 
>> we
>> should use GCC/clang builtins, or if There are higher level APIs in DPDK to 
>> do
>> atomic operations?
>>
>I have used gcc builtins (just like rte_ring does)
[Wang, Yipeng] I checked rte_ring, it also has a specific header for C11, since 
it is a C11
standard, do we need something similar here? 
>
>> 2. We believe compiler will translate the atomic_store/load to regular MOV
>> instruction on Total Store Order architecture (e.g. X86_64). But we run the
>> perf test on x86 and here is the relative slowdown on lookup comparing to
>> master head. I am not sure if the performance drop comes from the atomic
>> buitins.
>>
>C11 atomics also block compiler reordering. Other than this, the retry loop is 
>an addition to lookup.
>The patch also has the alignment corrected. I am not sure how is that 
>affecting the perf numbers.
>
>> Keysize | single lookup | bulk lookup
>> 4 | 0.93 | 0.95
>> 8 | 0.95 | 0.96
>> 16 | 0.97 | 0.96
>> 32 | 0.97 | 1.00
>> 48 | 1.03 | 0.99
>> 64 | 1.04 | 0.98
>> 9 | 0.91 | 0.96
>> 13 | 0.97 | 0.98
>> 37 | 1.04 | 1.03
>> 40 | 1.02 | 0.98
>>
>I assume this is the data from the test cases in test_hash_perf.c file. I 
>tried to reproduce this data, but my data is worse. Can you
>specify the actual test from test_hash_perf.c you are using (With 
>locks/Pre-computed hash/With data/Elements in primary)?
>IMO, the differences you have provided are not high.


[Wang, Yipeng] I remember the performance data I used is the no-lock, without 
hash, with 8-byte data, in both primary and secondary.
I compared the master head  to the one with your first two commits. 
>
>> [Wang, Yipeng] I think even for current code, we need to check empty_slot.
>> Could you export this as a bug fix commit?
>>
>In the existing code, there is check 'if (!!key_idx & !rte_hash....)'. Are you 
>referring to '!!key_idx'? I think this should be changed to
>'(key_idx != EMPTY_SLOT)'.
[Wang, Yipeng] Yeah, I guess I did not see that part. Then I guess it is no 
need to export as a bug fix for now since it is not a functional issue.
Your change is good.

Re: [dpdk-dev] [PATCH 2/4] hash: add memory ordering to avoid race conditions

Reply via email to