On 17 May 2016 at 23:19, Bill Fischofer <bill.fischo...@linaro.org> wrote:

>
> Just to summarize the discussion from earlier today with a view towards
> how we move forward:
>
>    - Because today we have threads that all share a single address space,
>    ODP handles and addresses derived from them may be shared across the entire
>    ODP instance and all threads running in it.
>
> Yes, this is my understanding.

>
>    - It's agreed that such free sharing makes life easy for application
>    writers and would be desirable even when multiple processes are involved.
>
> If this means that the application is given a way to share a common
address space (e.g. use pthreads, or allocate the resources whose
addresses it needs to share before forking), then yes.


>
>    - The straightforward way of preserving this behavior when threads may
>    run as processes is for the application to communicate as part of
>    odp_init_global() (via the odp_init_t parameters) the maximum amount of
>    shared memory it will use.
>
No, this is not my understanding. As I understood yesterday, we dropped
the exception for shmem objects that was discussed earlier (which I think
is good). What we now say, as far as I understood, is that we go for choice
1) depicted in the starting mail of this thread, meaning:
addresses returned by the ODP API (and any other addresses) have the scope
defined by the OS, and this applies to shmem addresses as well. If an
application wants to share shmem addresses (or any others) between ODP
threads, these ODP threads will have to be (assuming Linux) either pthreads
or processes which forked after the odp_shm_reserve() (or the allocation of
whatever the pointer points to). As a result, there is no need to know the
total amount of memory the application may need in advance.

But:
The application will expect ODP objects (referenced by handles) to work
across processes, as it should remain unaware of their internal
implementation. For instance:
A process 'A' may fork 2 processes 'B' and 'C'. B may create a pktio whose
handle is sent (using some kind of IPC) to C. My understanding is that the
application is allowed to send/receive packets to/from the pktio from both
B and C. And it should work.
This means that the pktio (within ODP) must be coded in a way that enables
using it from different processes (even ones forked before the pktio
creation). To achieve this, the pktio should either assume 2 completely
different address spaces and cope with it (using offsets, IPC...) and/or be
given a way to allocate shared memory guaranteed to be mapped at the same
address. The second approach enables better performance.
This is what I tried to say when talking about the 2 kinds of shared memory
I could foresee: the usual odp_shm_reserve() for the application, and a
_odp_internal_shmem_reserve() (you will probably find a better name :-) ),
not visible in the API, to reserve shared memory guaranteed to be mapped at
the same address in all processes. The latter function would only be used
by ODP itself, and may take its memory from a preallocated pool (I am not
sure this is even needed on Linux, but it might be on other OSes). The size
of this pool should be far less than the total memory size the app is
expected to use. The memory allocated from this pool is for ODP internal
usage only.

>
>    - When ODP sees this, at odp_init_global() time it will reserve a
>    single giant block of shared memory (possibly gigabytes in size) that will
>    be used to satisfy all odp_shm_reserve() calls issued by this ODP instance.
>    If the application underestimates its needs here, then the result will be
>    an odp_shm_reserve() failure when this global shared memory block is
>    exhausted, even if the underlying system still has more memory available.
>
>
No, this is not my understanding. Hopefully the text above made it
clearer...

>
>    - With this convention, as long as processes are created after
>    odp_init_global() is called, all ODP handles and addresses derived from
>    them will be sharable across the entire ODP instance as before even if
>    threads are executing in (child) processes created via an underlying
>    linux-style fork() operation. The downside is we may have to revisit the
>    current "flags" argument to odp_shm_reserve() or else have multiple
>    pre-reserved blocks, one for each valid flags value.
>
>  No, this is not my understanding. Hopefully the text above made it
clearer...

>
>    - The alternative is to allow new shared memory objects to be created
>    after process creation with the understanding that these objects will not
>    be sharable across process boundaries.
>
> I thought this is what we said.


> Tiger Moth could support either of these conventions, however it would
> seem the first one (single reserve during odp_init_global()) would be the
> most compatible with where we are today and potentially be the more
> convenient model for current and future ODP applications.
>

I am afraid there will be a rather large amount of work revisiting the code
for all current ODP objects to allow process mode. There may be unpleasant
grey zones as well, or at least places which would deserve clear docs: for
instance, packet addresses will be valid based on their pool's address
scope: 2 processes can expect a packet address to be sharable as long as
its *pool* was created before the fork (not the packet itself). This is a
consequence of the fact that memory is actually allocated at pool creation
rather than at packet creation. Maybe some ODP implementations will
allocate the packet memory at packet creation time, which may create
inconsistencies between ODP implementations...

I am afraid we are only seeing the tip of the iceberg at this stage, but
this has to be clarified now, before ODP grows even more.

Christophe.

>
>
> On Tue, May 17, 2016 at 1:45 AM, Christophe Milard <
> christophe.mil...@linaro.org> wrote:
>
>>
>>
>> On 17 May 2016 at 02:09, Bill Fischofer <bill.fischo...@linaro.org>
>> wrote:
>>
>>>
>>>
>>> On Fri, May 13, 2016 at 6:32 AM, Savolainen, Petri (Nokia - FI/Espoo) <
>>> petri.savolai...@nokia.com> wrote:
>>>
>>>>
>>>>
>>>>
>>>>
>>>> *From:* Christophe Milard [mailto:christophe.mil...@linaro.org]
>>>> *Sent:* Friday, May 13, 2016 11:16 AM
>>>> *To:* Petri Savolainen <petri.savolai...@linaro.org>; Brian Brooks <
>>>> brian.bro...@linaro.org>; Bill Fischofer <bill.fischo...@linaro.org>;
>>>> Mike Holmes <mike.hol...@linaro.org>; LNG ODP Mailman List <
>>>> lng-odp@lists.linaro.org>; Anders Roxell <anders.rox...@linaro.org>
>>>> *Cc:* Tobias Lindquist <tobias.lindqu...@ericsson.com>
>>>> *Subject:* process thread and mixed mode
>>>>
>>>>
>>>>
>>>> Hi,
>>>>
>>>>
>>>>
>>>> Let me start in a bit of an unusual way :-). I do not like my "running
>>>> things in process mode" patch series that much...
>>>>
>>>> But I do think it should be merged. With the mixed mode.
>>>>
>>>>
>>>>
>>>> I think mixed mode is a good addition, but it would need more options,
>>>> e.g. --odph-num-proc <num> and --odph-num-pthread-per-proc <num>, and
>>>> thus it can be added later. Pure proc mode is enough to get things
>>>> started.
>>>>
>>>>
>>>>
>>>> The main goal of the patch series is to push forward the work regarding
>>>> a proper and consistent definition of some ODP low-level concepts and
>>>> objects. The patch series makes things fail, hence pushing for defining
>>>> and coding things better, I hope. Removing "provocative" things like the
>>>> mixed mode from it is just defeating the purpose of this series: I'll be
>>>> happy to remove it (and plenty of other things) when we know which
>>>> well-documented definition/rule makes this code irrelevant.
>>>>
>>>> As long as we keep saying that ODP threads could be indifferently
>>>> Linux processes or pthreads, I'll be thinking that this series should
>>>> work, including any kind of mixed mode.
>>>>
>>>>
>>>>
>>>> However, processes and threads are different things in Linux, and I
>>>> believe this distinction should be kept within ODP: trying to make
>>>> processes behave as pthreads for some ODP objects (such as addresses of
>>>> shmem objects) does not sound like a good idea to me. Yes, this can
>>>> maybe be done with Linux (and even possibly without preallocating the
>>>> memory), but still, Linux creates 2 distinct virtual address spaces on
>>>> fork(), so why would ODP make any exception to that?
>>>> ...And if we do want to create such exceptions (which I don't like),
>>>> shouldn't ODP define its own concepts of process and thread, as they
>>>> obviously wouldn't match 100% the definitions of the OS?
>>>>
>>>>
>>>>
>>>>
>>>>
>>>> A lot of application internal state data may be directly managed with
>>>> the OS. But I think we cannot depend 100% on OS shared data (== remove
>>>> the shm API altogether), since the HW/implementation could have multiple
>>>> ways to optimize where/how the memory is allocated/mapped (chip internal
>>>> vs. external memory, huge page mapped vs. normal page mapped, IO-MMU
>>>> mapped, etc.) ... or when passing pointers to an ODP API, the API spec
>>>> may require that the memory is allocated from SHM (the application would
>>>> not know if the API function is implemented as HW or SW).
>>>>
>>>>
>>>>
>>>>
>>>>
>>>> I do nevertheless definitely agree with Petri when saying that ODP
>>>> applications should be given a way to "share pointers" between
>>>> odpthreads. A "process only" behavior forcing every concurrent
>>>> execution task to use tricky offset calculations to find its data
>>>> would be too nasty and inefficient.
>>>>
>>>>
>>>>
>>>> I think this will boil down to 2 choices:
>>>>
>>>> 1) Take these object definitions from the OS and accept the OS rules.
>>>>
>>>> 2) Define our own ODP objects (and possibly define our own rules)
>>>>
>>>>
>>>>
>>>> 1)Taking the OS definitions:
>>>>
>>>> Choice 1) is consistent as long as we accept the OS rules. Taking that
>>>> choice would mean that any address remains valid within a process only:
>>>> shmem pointers will not be able to be shared between different
>>>> processes. I do not see that as a big problem, as an application can use
>>>> pthreads if it needs to share the same VA, or can fork() at a convenient
>>>> time. Moreover, existing applications which would be ported to ODP are
>>>> probably OK, as they would already be using pthreads/processes in the
>>>> correct way.
>>>>
>>>>
>>>>
>>>> A more tricky aspect of choice 1) is the development of ODP itself:
>>>> operations involving many "ODP threads" will have to cope with both
>>>> processes and threads, possibly without a clear view of what is done by
>>>> the application. (I guess in Linux, if getpid()==gettid() when calling
>>>> odp_init_local(), one can safely assume that the app forked...?)
>>>> I guess it is doable: assuming all ODP threads are descendants of the
>>>> instantiating process, ODP can initially reserve shared memory in
>>>> odp_init_global() for its own purpose. It has a price: within ODP (and
>>>> anything external to the app itself, such as drivers) a common VA cannot
>>>> be assumed. ODP/drivers will have to be prepared for an app fork at any
>>>> time and hence do their internal stuff as if pointers could not be
>>>> shared (except within the initially allocated memory). This will cost
>>>> ODP performance.
>>>>
>>>>
>>>>
>>>> Taking choice 1 probably also means that shmem should no longer be part
>>>> of ODP: allocating memory is as much an OS thing as forking a process...
>>>> why would shmem objects be ODP objects while ODP threads would not be?
>>>>
>>>>
>>>>
>>>>
>>>>
>>>> Memory is a HW resource (such as e.g. a packet interface), and may have
>>>> different performance/properties depending on SoCs (cpu local memory, chip
>>>> internal memory, memory block carved out from a shared cache, DDR not
>>>> shared with other NUMA nodes, DDR shared with other NUMA nodes, persistent
>>>> memory, …). By default, operating system would give you access to the
>>>> common external memory. SDKs typically have some way to manage shared
>>>> memory.
>>>>
>>>>
>>>>
>>>> Thread of execution is defined by the OS and language, rather than the
>>>> SoC.
>>>>
>>>>
>>>>
>>>> An implementation may need additional parameters in global_init
>>>> (implementation specific or standard), or as command line options to
>>>> prepare for a specific mode, max number of procs, max memory usage, etc.
>>>>
>>>>
>>>>
>>>
>>> I think the fundamental problem here is that we've got an ODP concept
>>> for thread of execution (odp_thread_t) but do not have any corresponding
>>> concept of address space. The result is that the odp_thread_t type is being
>>> stretched too thin. Would it not make more sense to introduce an
>>> odp_process_t type that represents the address space that one or more
>>> threads execute in?
>>>
>>
>> I fully (110%!) agree that ODP has to be aware of address spaces!
>>
>> It then becomes straightforward to say that while ODP handles have ODP
>>> instance scope, addresses derived from those handles have ODP process
>>> scope. Thus threads can freely share addresses derived from ODP handles
>>> only with other threads belonging to the same ODP process.
>>>
>>
>> I am not sure I fully understand how you see this: if odp_thread_t and
>> odp_process_t remain OS things, how would this work? I mean, once the ODP
>> app forks, it would call an odp_process_create() instead of an
>> odp_thread_create()? And on join or wait, it would call odp_term_local()
>> with both an odp_thread_t and an odp_process_t as parameters, to make
>> sure ODP can figure out what's going on? I would like to see a clear
>> scenario here. The other alternative is to have ODP wrappers around
>> fork(), pthread_create(), wait() and join(). Then ODP has complete
>> control, and the application does not need to keep the OS and ODP "in
>> sync". This allows our own definition of these objects if we really want
>> to have them different from their Linux counterparts.
>>
>>
>>> In Monarch we support a single (default) process and all ODP threads run
>>> in this single process.  For Tiger Moth we'd expect to support multiple
>>> processes, each of which can contain one or more ODP threads. We can leave
>>> the process creation mechanism to be specified by the implementation, just
>>> as we do threads today, but this would solve the semantic confusion we
>>> currently have between threads and address spaces.
>>>
>>> While ODP itself would support the concept of processes, each
>>> implementation can determine the number of processes (address spaces) that
>>> are supported for a given ODP implementation, and applications would query
>>> and respond to this limit just as they do today with the thread limit. We'd
>>> probably extend the thread limit to include both an absolute number of
>>> threads as well as the max supported in each process.
>>>
>>
>> This is really saying we are wrapping fork() and pthread_create(), isn't
>> it? Why would ODP care about OS limits otherwise?
>>
>>
>>>
>>> The ODP shm construct thus becomes a special object that has an
>>> addressability scope that can span ODP processes.
>>>
>>
>> I don't agree: if the definition of address space remains an OS thing
>> (e.g. process), ODP should obey OS rules.
>>
>>
>>> However, we'd expect that each implementation can impose its own rules
>>> as to how this is managed. For example, on platforms that follow the Linux
>>> process model this means that shm's need to be created prior to process
>>> creation because processes are created via the fork() mechanism that
>>> ensures that shm's in the parent process are visible at the same virtual
>>> address in its children.
>>>
>>
>> Aren't you saying the opposite of the previous sentence here? I read
>> this as if we should obey OS rules...
>>
>>
>> Christophe
>>
>>
>>> Other platforms that have more sophisticated virtual addressing
>>> capabilities (e.g., many mainframe systems) may behave differently. On such
>>> platforms a "pointer" is actually an address space identifier and an offset
>>> pair, and the HW can efficiently perform virtual address translations on
>>> multiple address spaces simultaneously. So in this case a shm is simply a
>>> separate address space that consists only of data rather than code with one
>>> or more threads of execution running in it.
>>>
>>>
>>>>
>>>>
>>>> 2) Defining new ODP objects
>>>>
>>>> I guess in this case, we'd have to create both ODP thread and ODP
>>>> process objects and the related usual methods. The advantage of this
>>>> approach is that ODP has more control over what is being done. It also
>>>> allows for an ODP definition of things (such as the above-mentioned
>>>> exception), if we really want to...
>>>>
>>>> It makes applications more portable (if ODP abstracts some of the
>>>> common OS functionality, the apps using these ODP api becomes OS agnostic).
>>>>
>>>> As we already have done for shmem, ODP methods for odp threads and
>>>> processes could be provided to retrieve underlying OS stuff (such as 
>>>> PID...)
>>>>
>>>>
>>>> The problem of having to assume non common VA within ODP remains.
>>>>
>>>>
>>>>
>>>> ... and of course where should we stop wrapping the OS...?
>>>>
>>>>
>>>>
>>>> 3) There is a third alternative as well: stating that ODP is thread
>>>> only :-). No process, no problem?
>>>>
>>>>
>>>>
>>>> ODP needs to run also as processes, since applications are built from
>>>> many different libraries and legacy code which may require process model.
>>>> Maybe all implementations would not support it, but major ones should.
>>>>
>>>>
>>>>
>>>> -Petri
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>> Thank you for reading this far :-). Waiting for your comments...
>>>>
>>>>
>>>>
>>>> Christophe.
>>>>
>>>
>>>
>>
>
_______________________________________________
lng-odp mailing list
lng-odp@lists.linaro.org
https://lists.linaro.org/mailman/listinfo/lng-odp
