From: Christophe Milard [mailto:christophe.mil...@linaro.org] 
Sent: Friday, May 27, 2016 4:08 PM
To: Savolainen, Petri (Nokia - FI/Espoo) <petri.savolai...@nokia.com>
Cc: yi...@linaro.org; lng-odp@lists.linaro.org
Subject: Re: [lng-odp] [PATCHv3] helper: cleaner interface to 
odph_odpthreads_create/join



On 27 May 2016 at 09:56, Savolainen, Petri (Nokia - FI/Espoo) 
<petri.savolai...@nokia.com> wrote:


> -----Original Message-----
> From: lng-odp [mailto:lng-odp-boun...@lists.linaro.org] On Behalf Of
> Christophe Milard
> Sent: Friday, May 27, 2016 10:39 AM
> To: yi...@linaro.org; lng-odp@lists.linaro.org
> Subject: [lng-odp] [PATCHv3] helper: cleaner interface to
> odph_odpthreads_create/join
>
> The odph_odpthread_t (non opaque array of odpthread) is now replaced by a
> single type odph_odpthread_tbl_t abstracted as a void*.
> The table describing the odpthreads being created is now malloc'd and
> freed by the helper function themselves.
>


This change assumes that application will create all threads with a single call 
(which is not always practical or even possible). If application creates 
threads one by one, you'd end up with N separate thread table pointers, which 
application would need to store into another table anyway.

Correct: My approach let tha application decides how it wants to group the 
threads: If the threads share the same functions to run etc, the application 
can handle it in a single call (1 table of N threads). If the  threads cannot 
be stardes at the same time, tha application my take the N table if 1 entry 
instead.
 

It would be better to either define opaque odph_odpthread_t with correct size, 
or define alloc/free calls for a odph_odpthread_tbl_t.

odph_odpthread_tbl_t *thr_tbl;
int num = 5;
odp_cpumask_t cpumask;

thr_tbl = odph_odpthread_tbl_alloc(num);

odp_cpumask_zero(&cpumask);
odp_cpumask_set(&cpumask, 1);
odph_odpthreads_create(thr_tbl, &cpumask, ...);

...

odp_cpumask_zero(&cpumask);
odp_cpumask_set(&cpumask, 2);
odph_odpthreads_create(thr_tbl, &cpumask, ...);

...

odp_cpumask_zero(&cpumask);
odp_cpumask_set(&cpumask, 3);
odp_cpumask_set(&cpumask, 4);
odp_cpumask_set(&cpumask, 5);
odph_odpthreads_create(thr_tbl, &cpumask, ...);

hmmm... what would happen at join time? would the application have to provide a 
mask with the set of CPU threads to be join hence letting the responsibility of 
the join consistency to the application? or would all the threads in the table 
be joined (which is maybe not what the application wants)?...

What happens the day we say more than one thread can run on each cpu?
(the current approach can easily be modified by passing an array of N cpu 
number -possibly including the same cpu many times-, instead of the current 
mask at creation time.). How would you describe which threads should be started 
or joined in your case? I guess an array of odp_thread ids would have to be 
given as well... or?

I think the appraoch taken here is cleaner: each table is a group of threads 
which are created and joined at the same time. The helpers guarantees this 
consistency over such a group: the table size, the starting of all group 
threads and the joining of all group threads is done by the helper.
Yes, the application may have the need to create many such tables if it needs 
different groups of threads, but the consistency within each group is garanteed 
by the helper.
Besides, the need to have different tables for different group of threads does 
not necessarily make the application itself less clear: the pktio performance 
test, for instance, now defines two sets of threads: the TX threads and the RX 
threads. Is that  worse than having a single table of threads and letting the 
application try to distinguish them by some other means (the cpu  mask in your 
approach, as long as this works...) 

Your approach removes the need to have many tables within the application, 
indeed, but the consistency (size of table, number of started threads and 
number of joined threads) has to be maintained by the caller.
My approach lets the helper maintain the consistency within a table and force 
the application to separate different thread categories in different tables...

I still think my approach is more consistent. Hope I can convince you :-)

Have a nice week-end,

Christophe.



Of course, I had a bug in my example above and missed the index into the 
table... Logic is the same we already have in direct pthread/process calls 
(provides the same flexibility).

odph_odpthread_tbl_t *thr_tbl;
int num = 5;
odp_cpumask_t cpumask;

thr_tbl = odph_odpthread_tbl_alloc(num);

// create first
odp_cpumask_zero(&cpumask);
odp_cpumask_set(&cpumask, 1);
odph_odpthreads_create(&thr_tbl[0], &cpumask, ...);

...

// create second
odp_cpumask_zero(&cpumask);
odp_cpumask_set(&cpumask, 2);
odph_odpthreads_create(&thr_tbl[1], &cpumask, ...);

...

// create three more
odp_cpumask_zero(&cpumask);
odp_cpumask_set(&cpumask, 3);
odp_cpumask_set(&cpumask, 4);
odp_cpumask_set(&cpumask, 5);
odph_odpthreads_create(&thr_tbl[2], &cpumask, ...);

...

// wait for the first two
odph_odpthreads_join(&thr_tbl[0], 2);

// wait for the rest
odph_odpthreads_join(&thr_tbl[2], 3);

odph_odpthread_tbl_free(thr_tbl);


-Petri

_______________________________________________
lng-odp mailing list
lng-odp@lists.linaro.org
https://lists.linaro.org/mailman/listinfo/lng-odp

Reply via email to