Re: [lng-odp] process thread and mixed mode

Christophe Milard Mon, 16 May 2016 23:47:00 -0700

On 17 May 2016 at 02:09, Bill Fischofer <bill.fischo...@linaro.org> wrote:


>
>
> On Fri, May 13, 2016 at 6:32 AM, Savolainen, Petri (Nokia - FI/Espoo) <
> petri.savolai...@nokia.com> wrote:
>
>>
>>
>>
>>
>> *From:* Christophe Milard [mailto:christophe.mil...@linaro.org]
>> *Sent:* Friday, May 13, 2016 11:16 AM
>> *To:* Petri Savolainen <petri.savolai...@linaro.org>; Brian Brooks <
>> brian.bro...@linaro.org>; Bill Fischofer <bill.fischo...@linaro.org>;
>> Mike Holmes <mike.hol...@linaro.org>; LNG ODP Mailman List <
>> lng-odp@lists.linaro.org>; Anders Roxell <anders.rox...@linaro.org>
>> *Cc:* Tobias Lindquist <tobias.lindqu...@ericsson.com>
>> *Subject:* process thread and mixed mode
>>
>>
>>
>> Hi,
>>
>>
>>
>> Let me start in a slightly unusual way :-). I do not like my "running things
>> in process mode" patch series that much...
>>
>> But I do think it should be merged. With the mixed mode.
>>
>>
>>
>> I think mixed mode is a good addition, but it would need more options, e.g.
>> --odph-num-proc <num> --odph-num-pthread-per-proc <num>, and thus can be
>> added later. Pure proc mode is enough to get things started.
>>
>>
>>
>> The main goal of the patch series is to push the work regarding a proper
>> and consistent definition of some ODP low level concepts and objects. The
>> patch series makes things fail, hence pushing for defining and coding
>> things better, I hope.  Removing "provocative" things like the mixed mode
>> from it is just defeating the purpose of this series: I'll be happy to
>> remove it (and plenty of other things) when we know which well documented
>> definition/rule makes this code not relevant.
>>
>> As long as we keep saying that odp threads could interchangeably be linux
>> processes or pthreads, I'll be thinking that this series should work,
>> including any kind of mixed mode.
>>
>>
>>
>> However, processes and threads are different things in linux, and I
>> believe this distinction should be kept within ODP: Trying to make
>> processes behave as pthreads for some ODP objects (such as addresses for
>> shmem objects) does not sound like a good idea to me. Yes, this can maybe
>> be done with linux -and even possibly without preallocating the memory-,
>> but still, linux creates 2 distinct virtual address spaces on fork(), so why
>> would ODP make any exception to that?
>> ...And if we do want to create such exceptions (which I don't like),
>> shouldn't ODP define its own concept of process and thread, as it obviously
>> wouldn't match 100% the definition of the OS?
>>
>>
>>
>>
>>
>> A lot of application internal state data may be directly managed with the OS.
>> But I think we cannot depend 100% on OS shared data (== remove the shm API
>> altogether), since HW/implementation could have multiple ways to optimize
>> where/how the memory is allocated/mapped (chip internal vs. external
>> memory, huge page mapped vs. normal page mapped, IO-MMU mapped, etc) … or
>> when passing pointers to ODP API, the API spec may require that the memory
>> is allocated from SHM (application would not know if the API function is
>> implemented in HW or SW).
>>
>>
>>
>>
>>
>> I do nevertheless definitely agree with Petri that ODP applications
>> should be given a way to "share pointers" between odpthreads.
>> Taking the "process only" behavior, forcing every concurrent execution task
>> to use tricky offset calculations to find its data, is too nasty and
>> inefficient.
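
For illustration only, the kind of offset calculation that a process-only model would push onto applications might look like the minimal C sketch below. All names here (`shared_header`, `ptr_to_offset`, `offset_to_ptr`, `offset_roundtrip_demo`) are hypothetical and not part of the ODP API. Two heap buffers stand in for one shared block that two processes have mapped at different virtual addresses.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical record stored at the start of a shared block. It refers to
 * its payload by offset from the block start, never by raw pointer. */
struct shared_header {
    size_t payload_offset;
};

/* Convert a pointer inside the block into a position-independent offset. */
static inline size_t ptr_to_offset(const void *base, const void *ptr)
{
    return (size_t)((const char *)ptr - (const char *)base);
}

/* Convert an offset back into a pointer valid for this particular mapping. */
static inline void *offset_to_ptr(void *base, size_t offset)
{
    return (char *)base + offset;
}

/* Returns the payload as seen through the second "mapping", or -1 on error. */
int offset_roundtrip_demo(void)
{
    enum { BLOCK_SIZE = 64, PAYLOAD_AT = 32 };
    unsigned char *map_a = calloc(1, BLOCK_SIZE); /* view of "process A" */
    unsigned char *map_b = malloc(BLOCK_SIZE);    /* view of "process B" */
    if (map_a == NULL || map_b == NULL) {
        free(map_a);
        free(map_b);
        return -1;
    }

    /* Process A writes a payload and publishes its offset. */
    struct shared_header *hdr_a = (struct shared_header *)map_a;
    int *payload_a = (int *)(map_a + PAYLOAD_AT);
    *payload_a = 1234;
    hdr_a->payload_offset = ptr_to_offset(map_a, payload_a);

    /* Stand-in for the same bytes being visible at another address. */
    memcpy(map_b, map_a, BLOCK_SIZE);

    /* Process B rebuilds a usable pointer from its own base address. */
    struct shared_header *hdr_b = (struct shared_header *)map_b;
    int *payload_b = offset_to_ptr(map_b, hdr_b->payload_offset);
    int value = *payload_b;

    free(map_a);
    free(map_b);
    return value;
}
```

Every access through the second view pays for a base-plus-offset computation, which is the overhead objected to above. A raw pointer stored by A would be meaningless in B.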
>>
>>
>>
>> I think this will boil down to 2 choices:
>>
>> 1) Take these object definitions from the OS and accept the OS rules.
>>
>> 2) Define our own ODP objects (and possibly define our own rules)
>>
>>
>>
>> 1) Taking the OS definitions:
>>
>> Choice 1) is consistent as long as we accept the OS rules. Taking that
>> choice would mean that any address remains valid within a process only:
>> shmem pointers cannot be shared between different processes. I
>> do not see that as a big problem, as the applications can use pthreads if they
>> need to share the same VA, or can fork() at a convenient time. Moreover,
>> existing applications which should be ported to ODP are probably OK as they
>> would already be using pthreads/processes in the correct way.
>>
>>
>>
>> A more tricky aspect of choice 1) is the development of ODP itself:
>> Operations involving many "ODPthreads" will have to cope with both
>> processes and threads, possibly without a clear view of what is done by the
>> application. (I guess in linux if getpid()==gettid() when calling
>> odp_init_local() one can assume safely that the app forked...?).
>> I guess it is doable: assuming all odpthreads are descendants of the
>> instantiation process, ODP can initially reserve shared memory in
>> odp_init_global() for its own purpose. It has a price: within ODP (and
>> anything external to the app itself, such as drivers) a common VA cannot be
>> assumed. ODP/drivers will have to be prepared for an app fork at any time and
>> hence do their internal stuff as if pointers could not be shared (except
>> within the initially allocated memory). This will cost ODP performance.
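
As a hedged sketch of the check guessed at above: on Linux, getpid() == gettid() holds for the initial thread of any process, including the instantiation process itself. A library that wants to tell "forked" from "pthread" would therefore probably also compare against the pid recorded at global init time. The names `record_instantiation_process`, `classify_caller` and `classify_in_forked_child` are hypothetical helpers, not ODP API, and the places they would be called from (odp_init_global() and odp_init_local()) are an assumption.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <sys/syscall.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

enum caller_kind {
    CALLER_INIT_THREAD,    /* initial thread of the instantiation process */
    CALLER_PTHREAD,        /* additional thread in the instantiation process */
    CALLER_FORKED_PROCESS  /* any thread running in a different process */
};

/* pid of the process that performed global init (hypothetical bookkeeping). */
static pid_t instantiation_pid;

/* Would be called once, e.g. from something like odp_init_global(). */
void record_instantiation_process(void)
{
    instantiation_pid = getpid();
}

/* Would be called per thread, e.g. from something like odp_init_local(). */
enum caller_kind classify_caller(void)
{
    pid_t pid = getpid();
    /* syscall() is used because older glibc has no gettid() wrapper. */
    pid_t tid = (pid_t)syscall(SYS_gettid);

    if (pid != instantiation_pid)
        return CALLER_FORKED_PROCESS;
    return (pid == tid) ? CALLER_INIT_THREAD : CALLER_PTHREAD;
}

/* Forks, classifies the caller inside the child and reports the result
 * to the parent through the exit status. Returns -1 on error. */
int classify_in_forked_child(void)
{
    pid_t child = fork();
    if (child < 0)
        return -1;
    if (child == 0)
        _exit((int)classify_caller());

    int status;
    if (waitpid(child, &status, 0) < 0 || !WIFEXITED(status))
        return -1;
    return WEXITSTATUS(status);
}
```

The recorded pid lives in ordinary process memory, so it is copied into every child by fork() and stays comparable there. The sketch does not distinguish between threads inside a forked process.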
>>
>>
>>
>> Taking choice 1 probably also means that shmem should no longer be part of
>> ODP: Allocating memory is as much an OS thing as forking a process... why
>> would shmem objects be ODP objects while ODP threads would not be?
>>
>>
>>
>>
>>
>> Memory is a HW resource (such as e.g. a packet interface), and may have
>> different performance/properties depending on SoCs (cpu local memory, chip
>> internal memory, memory block carved out from a shared cache, DDR not
>> shared with other NUMA nodes, DDR shared with other NUMA nodes, persistent
>> memory, …). By default, the operating system would give you access to the
>> common external memory. SDKs typically have some way to manage shared
>> memory.
>>
>>
>>
>> Thread of execution is defined by the OS and language, rather than the
>> SoC.
>>
>>
>>
>> An implementation may need additional parameters in global_init
>> (implementation specific or standard), or as command line options to
>> prepare for a specific mode, max number of procs, max memory usage, etc.
>>
>>
>>
>
> I think the fundamental problem here is that we've got an ODP concept for
> thread of execution (odp_thread_t) but do not have any corresponding
> concept of address space. The result is that the odp_thread_t type is being
> stretched too thin. Would it not make more sense to introduce an
> odp_process_t type that represents the address space that one or more
> threads execute in?
>

I fully (110%!) agree that ODP has to be aware of address spaces!

> It then becomes straightforward to say that while ODP handles have ODP
> instance scope, addresses derived from those handles have ODP process
> scope. Thus threads can freely share addresses derived from ODP handles
> only with other threads belonging to the same ODP process.
>

I am not sure I fully understand how you see this: If odp_thread_t and
odp_process_t remain OS things, how would this work? I mean, once the ODP
app forks, would it call an ODP_create_process() instead of an
ODP_thread_create()? And on join or wait, would it call odp_term_local()
with both an odp_thread_t and odp_process_t as parameters to make sure ODP
can figure out what's going on? I would like to see a clear scenario here.
The other alternative is to have ODP wrappers around fork(),
pthread_create(), wait() and join(). Then ODP has complete control and the
application does not need to keep the OS and ODP "in sync".
This allows our own definition of these objects if we really want to have
them different from their linux counterparts.


> In Monarch we support a single (default) process and all ODP threads run
> in this single process.  For Tiger Moth we'd expect to support multiple
> processes, each of which can contain one or more ODP threads. We can leave
> the process creation mechanism to be specified by the implementation, just
> as we do threads today, but this would solve the semantic confusion we
> currently have between threads and address spaces.
>
> While ODP itself would support the concept of processes, each
> implementation can determine the number of processes (address spaces) that
> are supported for a given ODP implementation, and applications would query
> and respond to this limit just as they do today with the thread limit. We'd
> probably extend the thread limit to include both an absolute number of
> threads as well as the max supported in each process.
>

This is really saying we are wrapping fork() and pthread_create(), isn't it?
Why would ODP care about OS limits otherwise?


>
> The ODP shm construct thus becomes a special object that has an
> addressability scope that can span ODP processes.
>

Don't agree:  if the definition of address space remains an OS thing (e.g.
process), ODP should obey OS rules.


> However, we'd expect that each implementation can impose its own rules as
> to how this is managed. For example, on platforms that follow the Linux
> process model this means that shm's need to be created prior to process
> creation because processes are created via the fork() mechanism that
> ensures that shm's in the parent process are visible at the same virtual
> address in its children.
>

Aren't you saying the opposite of the previous sentence here? I read this
as if we should obey OS rules...
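
To make the OS rule under discussion concrete, here is a minimal Linux sketch using plain mmap() and fork() rather than any ODP call. A shared anonymous mapping created before fork() is inherited by the child at the same virtual address and writes to it are visible to the parent, while ordinary private data is only copied. The names `shared_area`, `fork_demo_result` and `run_fork_demo` are made up for this example.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <stdint.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Layout of the block shared between parent and child. */
struct shared_area {
    int value;            /* written by the child, read by the parent */
    uintptr_t child_view; /* address at which the child sees the block */
};

/* What the parent observes after the child has exited. */
struct fork_demo_result {
    int shared_value;  /* value read back from the shared mapping */
    int private_value; /* parent's copy of an ordinary variable */
    int same_address;  /* 1 if the child saw the block at the same VA */
};

/* Returns 0 on success and fills *out, -1 on any failure. */
int run_fork_demo(struct fork_demo_result *out)
{
    volatile int private_value = 1;

    /* The mapping is created BEFORE fork(), so the child inherits it. */
    struct shared_area *area = mmap(NULL, sizeof(*area),
                                    PROT_READ | PROT_WRITE,
                                    MAP_SHARED | MAP_ANONYMOUS, -1, 0);
    if (area == MAP_FAILED)
        return -1;
    area->value = 0;
    area->child_view = 0;

    pid_t child = fork();
    if (child < 0) {
        munmap(area, sizeof(*area));
        return -1;
    }
    if (child == 0) {
        private_value = 2;                  /* only changes the child's copy */
        area->value = 42;                   /* visible to the parent */
        area->child_view = (uintptr_t)area; /* publish the child's address */
        _exit(0);
    }

    int status;
    if (waitpid(child, &status, 0) < 0 ||
        !WIFEXITED(status) || WEXITSTATUS(status) != 0) {
        munmap(area, sizeof(*area));
        return -1;
    }

    out->shared_value = area->value;
    out->private_value = private_value;
    out->same_address = (area->child_view == (uintptr_t)area);
    munmap(area, sizeof(*area));
    return 0;
}
```

A mapping created after the fork() would not get this for free: each process would have to map the object itself, with no guarantee of landing at the same address.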


Christophe


> Other platforms that have more sophisticated virtual addressing
> capabilities (e.g., many mainframe systems) may behave differently. On such
> platforms a "pointer" is actually an address space identifier and an offset
> pair, and the HW can efficiently perform virtual address translations on
> multiple address spaces simultaneously. So in this case a shm is simply a
> separate address space that consists only of data rather than code with one
> or more threads of execution running in it.
>
>
>>
>>
>> 2) Defining new ODP objects
>>
>> I guess in this case, we'd have to create both ODP thread and ODP
>> process objects and the usual related methods. The advantage of this approach
>> is that ODP has more control of what is being done. It also allows for an
>> ODP definition of things (such as the above mentioned exception), if we
>> really want to...
>>
>> It makes applications more portable (if ODP abstracts some of the common
>> OS functionality, the apps using these ODP APIs become OS agnostic).
>>
>> As we already have done for shmem, ODP methods for odp threads and
>> processes could be provided to retrieve underlying OS stuff (such as PID...)
>>
>>
>> The problem of having to assume a non-common VA within ODP remains.
>>
>>
>>
>> ... and of course where should we stop wrapping the OS...?
>>
>>
>>
>> 3) There is a third alternative as well: stating that ODP is thread only
>> :-). No process, no problem?
>>
>>
>>
>> ODP also needs to run as processes, since applications are built from
>> many different libraries and legacy code which may require the process model.
>> Maybe not all implementations would support it, but major ones should.
>>
>>
>>
>> -Petri
>>
>>
>>
>>
>>
>>
>>
>> Thank you for reading this far :-). Waiting for your comments...
>>
>>
>>
>> Christophe.
>>
>
>
