On 12/13/16 10:50, Savolainen, Petri (Nokia - FI/Espoo) wrote:
> 
>>> +static void add_group(sched_group_t *sched_group, int thr, int group)
>>> +{
>>> +   int num;
>>> +   uint32_t gen_cnt;
>>> +   thr_group_t *thr_group = &sched_group->s.thr[thr];
>>> +
>>> +   num = thr_group->num_group;
>>> +   thr_group->group[num] = group;
>>> +   thr_group->num_group  = num + 1;
>>> +   gen_cnt = odp_atomic_load_u32(&thr_group->gen_cnt);
>>> +   odp_atomic_store_u32(&thr_group->gen_cnt, gen_cnt + 1);
>>
>>
>> odp_atomic_inc_u32()
> 
> Atomic, in-memory increment would be waste here since this operation is done 
> while holding a lock. The variable is atomic since it's checked with load_acq 
> in sched_cmd() (when not holding a lock).
> 
> 
>>
>>
>>> +}
>>> +
>>> +static void remove_group(sched_group_t *sched_group, int thr, int
>> group)
>>> +{
>>> +   int i, num;
>>> +   int found = 0;
>>> +   thr_group_t *thr_group = &sched_group->s.thr[thr];
>>> +
>>> +   num = thr_group->num_group;
>>> +
>>> +   for (i = 0; i < num; i++) {
>>> +           if (thr_group->group[i] == group) {
>>> +                   found = 1;
>>> +
>>> +                   for (; i < num - 1; i++)
>>> +                           thr_group->group[i] = thr_group->group[i + 1];
>>> +
>>> +                   break;
>>> +           }
>>> +   }
>>> +
>>> +   if (found) {
>>> +           uint32_t gen_cnt;
>>> +
>>> +           thr_group->num_group = num - 1;
>>> +           gen_cnt = odp_atomic_load_u32(&thr_group->gen_cnt);
>>> +           odp_atomic_store_u32(&thr_group->gen_cnt, gen_cnt + 1);
>>
>> odp_atomic_inc_u32()
>>
>> and you can remove if found by placing inc and break above. That should
>> fit in 80 characters.
> 
> 
> Same thing here. Load + store is more performant than an unnecessary atomic, 
> in-memory inc.
> 
> -Petri
> 

I think it depends. On ABI compat mode that will be function and you
need 2 stores in stack.

And what is faster depends on implementation of internal
__atomic_fetch_add call, which may vary on gcc version. I never measured
what is faster, but it's easy to check...

Maxim.




> 

Reply via email to