I've submitted a patch to the User Guide to summarize what I believe we've agreed to over the past several discussion sessions. I admit it doesn't look nearly as good when explicitly stated. Please review and I'll work to refine it.
On Wed, May 18, 2016 at 2:25 AM, Christophe Milard <christophe.mil...@linaro.org> wrote:

> On 17 May 2016 at 23:19, Bill Fischofer <bill.fischo...@linaro.org> wrote:
>
>> Just to summarize the discussion from earlier today with a view towards how we move forward:
>>
>> - Because today we have threads that all share a single address space, ODP handles and addresses derived from them may be shared across the entire ODP instance and all threads running in it.
>
> Yes, this is my understanding.
>
>> - It's agreed that such free sharing makes life easy for application writers and would be desirable even when multiple processes are involved.
>
> If this means that the application is given a way to share a common address space (e.g. use pthreads, or allocate the resources it needs to share addresses on before fork), then yes.
>
>> - The straightforward way of preserving this behavior when threads may run as processes is for the application to communicate as part of odp_init_global() (via the odp_init_t parameters) the maximum amount of shared memory it will use.
>
> No, this is not my understanding: as I understood yesterday, we dropped the exception for shmem objects that was discussed earlier (which I think is good). What we now say, as far as I understood, is that we go for choice 1) depicted in the starting mail of this thread, meaning: addresses returned by the ODP API (and any other addresses) have the scope defined by the OS, and this applies to shmem addresses as well. If an application wants to share shmem addresses (or any other addresses) between ODP threads, these ODP threads will have to be, assuming Linux, either pthreads or processes which forked after the shmem_reserve() (or after the allocation of whatever the pointer points to). As a result, there is no need to know the total amount of memory the application may need in advance.
>
> But: the application will expect ODP objects (referenced by handles) to work across processes, as it should remain unaware of their internal implementation. For instance: a process 'A' may fork 2 processes 'B' and 'C'. B may create a pktio whose handle is sent (using some kind of IPC) to C. My understanding is that the application is allowed to send/receive packets to/from the pktio from both B and C. And it should work.
>
> This means that the pktio (within ODP) must be coded in a way that enables using it from different processes (even processes forked before the pktio creation). To achieve this, the pktio should either assume 2 completely different address spaces and cope with it (using offsets, IPC...) or/and be given a way to allocate shared memory guaranteed to be mapped at the same address. The second approach enables better performance.
>
> This is what I tried to say when talking about the 2 kinds of shared memory I could foresee: the usual odp_shmem_reserve() for the application, and an _odp_internal_shmem_reserve() (you will probably find a better name :-) ), not visible in the API, to reserve shared memory guaranteed to be mapped at the same address in all processes. The latter function would only be used by ODP itself, and may take its memory from a preallocated pool (I am not sure this is even needed on Linux, but it might be on other OSes). The size of this pool should be far less than the total memory size the app is expected to use. The memory allocated from this pool is for ODP internal usage only.
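To make the fork-after-reserve rule concrete, here is a minimal sketch of the behavior Christophe describes, assuming Linux fork() semantics and the Monarch-era odp_instance_t APIs; the shm name and size are illustrative only and most error handling is omitted:

#include <stdio.h>
#include <string.h>
#include <unistd.h>
#include <sys/wait.h>
#include <odp_api.h>

int main(void)
{
    odp_instance_t instance;

    if (odp_init_global(&instance, NULL, NULL) ||
        odp_init_local(instance, ODP_THREAD_CONTROL))
        return 1;

    /* Reserve the shared memory BEFORE forking: the child inherits the
     * parent's mappings, so the address below stays valid in both. */
    odp_shm_t shm = odp_shm_reserve("shared_state", 4096,
                                    ODP_CACHE_LINE_SIZE, 0);
    char *addr = odp_shm_addr(shm);

    strcpy(addr, "written by the parent");

    if (fork() == 0) {
        /* Child process: the same virtual address is usable here only
         * because the mapping predates the fork. A block reserved after
         * the fork would not have this property. */
        printf("child reads: %s\n", addr);
        _exit(0);
    }

    wait(NULL);
    odp_shm_free(shm);
    odp_term_local();
    odp_term_global(instance);
    return 0;
}

With this convention the application never needs to declare its total memory use up front; it only has to order its reservations and its forks correctly.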
>> - When ODP sees this, at odp_init_global() time it will reserve a single giant block of shared memory (possibly gigabytes in size) that will be used to satisfy all odp_shm_reserve() calls issued by this ODP instance. If the application underestimates its needs here, then the result will be an odp_shm_reserve() failure when this global shared memory block is exhausted, even if the underlying system still has more memory available.
>
> No, this is not my understanding. Hopefully the text above made it clearer...
>
>> - With this convention, as long as processes are created after odp_init_global() is called, all ODP handles and addresses derived from them will be sharable across the entire ODP instance as before, even if threads are executing in (child) processes created via an underlying Linux-style fork() operation. The downside is we may have to revisit the current "flags" argument to odp_shm_reserve(), or else have multiple pre-reserved blocks, one for each valid flags value.
>
> No, this is not my understanding. Hopefully the text above made it clearer...
>
>> - The alternative is to allow new shared memory objects to be created after process creation with the understanding that these objects will not be sharable across process boundaries.
>
> I thought this is what we said.
>
>> Tiger Moth could support either of these conventions; however, it would seem the first one (a single reserve during odp_init_global()) would be the most compatible with where we are today and potentially the more convenient model for current and future ODP applications.
>
> I am afraid there will be a rather large amount of work revisiting the code of all current ODP objects to allow process mode. There may be unpleasant grey zones as well, or at least places which would deserve clear docs: for instance, packet addresses will be valid based on their pool's address scope: 2 processes can expect a packet address to be sharable as long as its *pool* is created before the fork (not the packet itself). This is a consequence of the fact that memory is actually allocated at pool creation rather than at packet creation. Maybe some ODP implementations will allocate the packet at packet creation time, which may create inconsistencies between ODP implementations...
>
> I am afraid we are only seeing the tip of the iceberg at this stage, but this has to be clarified now, before ODP grows even more.
>
> Christophe.
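The pool/packet scope point above can be illustrated with a short sketch. It assumes a linux-generic style implementation where the pool's memory is mapped at odp_pool_create() time; as Christophe notes, an implementation that maps memory per packet would behave differently. Names and sizes are illustrative:

#include <stdio.h>
#include <unistd.h>
#include <sys/wait.h>
#include <odp_api.h>

int main(void)
{
    odp_instance_t instance;
    odp_pool_param_t params;

    odp_init_global(&instance, NULL, NULL);
    odp_init_local(instance, ODP_THREAD_CONTROL);

    odp_pool_param_init(&params);
    params.type    = ODP_POOL_PACKET;
    params.pkt.num = 64;
    params.pkt.len = 1500;

    /* The pool (and, in this implementation model, its backing memory)
     * is created BEFORE the fork. */
    odp_pool_t pool = odp_pool_create("pkt_pool", &params);

    if (fork() == 0) {
        odp_init_local(instance, ODP_THREAD_WORKER);

        /* A packet allocated AFTER the fork still yields an address that
         * is meaningful in the parent too, because the memory behind it
         * was mapped at pool creation time. An implementation allocating
         * memory per packet would break this expectation. */
        odp_packet_t pkt = odp_packet_alloc(pool, 256);

        printf("child packet data at %p\n", odp_packet_data(pkt));
        odp_packet_free(pkt);
        odp_term_local();
        _exit(0);
    }

    wait(NULL);
    odp_pool_destroy(pool);
    odp_term_local();
    odp_term_global(instance);
    return 0;
}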
>> On Tue, May 17, 2016 at 1:45 AM, Christophe Milard <christophe.mil...@linaro.org> wrote:
>>
>>> On 17 May 2016 at 02:09, Bill Fischofer <bill.fischo...@linaro.org> wrote:
>>>
>>>> On Fri, May 13, 2016 at 6:32 AM, Savolainen, Petri (Nokia - FI/Espoo) <petri.savolai...@nokia.com> wrote:
>>>>
>>>>> *From:* Christophe Milard [mailto:christophe.mil...@linaro.org]
>>>>> *Sent:* Friday, May 13, 2016 11:16 AM
>>>>> *To:* Petri Savolainen <petri.savolai...@linaro.org>; Brian Brooks <brian.bro...@linaro.org>; Bill Fischofer <bill.fischo...@linaro.org>; Mike Holmes <mike.hol...@linaro.org>; LNG ODP Mailman List <lng-odp@lists.linaro.org>; Anders Roxell <anders.rox...@linaro.org>
>>>>> *Cc:* Tobias Lindquist <tobias.lindqu...@ericsson.com>
>>>>> *Subject:* process thread and mixed mode
>>>>>
>>>>> Hi,
>>>>>
>>>>> Let me start in a bit unusual way :-). I do not like my "running things in process mode" patch series that much... But I do think it should be merged. With the mixed mode.
>>>>>
>>>>> I think mixed mode is a good addition, but it would need more options, e.g. --odph-num-proc <num> --odph-num-pthread-per-proc <num>, and can thus be added later. Pure proc mode is enough to get things started.
>>>>>
>>>>> The main goal of the patch series is to push the work towards a proper and consistent definition of some ODP low level concepts and objects. The patch series makes things fail, hence pushing for defining and coding things better, I hope. Removing "provocative" things like the mixed mode from it just defeats the purpose of this series: I'll be happy to remove it (and plenty of other things) when we know which well documented definition/rule makes this code not relevant.
>>>>>
>>>>> As long as we keep saying that ODP threads can be indifferently Linux processes or pthreads, I'll be thinking that this series should work, including any kind of mixed mode.
>>>>>
>>>>> However, processes and threads are different things in Linux, and I believe this distinction should be kept within ODP: trying to make processes behave as pthreads for some ODP objects (such as addresses of shmem objects) does not sound like a good idea to me. Yes, this can maybe be done with Linux, and even possibly without preallocating the memory, but still, Linux creates 2 distinct virtual address spaces on fork(), so why would ODP make any exception to that? ...And if we do want to create such exceptions (which I don't like), shouldn't ODP define its own concepts of process and thread, as they obviously wouldn't match 100% the definitions of the OS?
>>>>>
>>>>> A lot of application internal state data may be directly managed with the OS. But I think we cannot depend 100% on OS shared data (== remove the shm API altogether), since the HW/implementation could have multiple ways to optimize where/how the memory is allocated/mapped (chip internal vs. external memory, huge page mapped vs. normal page mapped, IO-MMU mapped, etc.)... or, when passing pointers to an ODP API, the API spec may require that the memory is allocated from SHM (the application would not know if the API function is implemented as HW or SW).
>>>>>
>>>>> I do nevertheless definitely agree with Petri when saying that ODP applications should be given a way to "share pointers" between odpthreads. Taking the "process only" behavior and forcing every concurrent execution task to use tricky offset calculations to find its data is too nasty and inefficient.
>>>>>
>>>>> I think this will boil down to 2 choices:
>>>>>
>>>>> 1) Take these object definitions from the OS and accept the OS rules.
>>>>>
>>>>> 2) Define our own ODP objects (and possibly define our own rules).
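For concreteness, the "tricky offset calculation" Christophe refers to above looks roughly like this when two processes cannot assume a common virtual address for the same shared block; this is a generic illustration, not ODP API:

#include <stddef.h>
#include <stdint.h>

/* Data structures inside the shared block cannot hold raw pointers;
 * they must hold offsets from the block base, and every access pays
 * a conversion against the local mapping address. */

typedef struct {
    uint64_t next_offset;   /* offset of the next node, not a pointer */
    uint64_t data_offset;   /* offset of the payload, not a pointer   */
} node_t;

/* Convert an offset stored in shared memory into a usable pointer,
 * using this process's own mapping base address. */
static inline void *off_to_ptr(void *local_base, uint64_t offset)
{
    return (char *)local_base + offset;
}

/* Convert a local pointer back into an offset before storing it. */
static inline uint64_t ptr_to_off(void *local_base, void *ptr)
{
    return (uint64_t)((char *)ptr - (char *)local_base);
}

/* Example walk: every dereference needs the extra arithmetic, and any
 * pointer accidentally stored as-is silently breaks in the other process. */
static void *next_payload(void *local_base, node_t *node)
{
    node_t *next = off_to_ptr(local_base, node->next_offset);

    return off_to_ptr(local_base, next->data_offset);
}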
>>>>> 1) Taking the OS definitions:
>>>>>
>>>>> Choice 1) is consistent as long as we accept the OS rules. Taking that choice would mean that any address remains valid within a process only: shmem pointers cannot be shared between different processes. I do not see that as a big problem, as an application can use pthreads if it needs to share the same VA, or can fork() at a convenient time. Moreover, existing applications which would be ported to ODP are probably OK, as they would already be using pthreads/processes in the correct way.
>>>>>
>>>>> A more tricky aspect of choice 1) is the development of ODP itself: operations involving many "ODPthreads" will have to cope with both processes and threads, possibly without a clear view of what is done by the application. (I guess that on Linux, if getpid()==gettid() when calling odp_init_local(), one can safely assume that the app forked...?) I guess it is doable: assuming all odpthreads are descendants of the instantiation process, ODP can initially reserve shared memory in odp_init_global() for its own purposes. It has a price: within ODP (and anything external to the app itself, such as drivers) a common VA cannot be assumed. ODP/drivers will have to be prepared for an app fork at any time and hence do their internal work as if pointers could not be shared (except within the initially allocated memory). This will cost ODP performance.
>>>>>
>>>>> Taking choice 1 probably also means that shmem should no longer be part of ODP: allocating memory is as much an OS thing as forking a process... why would shmem objects be ODP objects while ODP threads would not be?
>>>>>
>>>>> Memory is a HW resource (such as e.g. a packet interface), and may have different performance/properties depending on the SoC (CPU local memory, chip internal memory, a memory block carved out from a shared cache, DDR not shared with other NUMA nodes, DDR shared with other NUMA nodes, persistent memory, ...). By default, the operating system would give you access to the common external memory. SDKs typically have some way to manage shared memory.
>>>>>
>>>>> A thread of execution is defined by the OS and language, rather than by the SoC.
>>>>>
>>>>> An implementation may need additional parameters in global_init (implementation specific or standard), or as command line options, to prepare for a specific mode, max number of procs, max memory usage, etc.
>>>>
>>>> I think the fundamental problem here is that we've got an ODP concept for a thread of execution (odp_thread_t) but do not have any corresponding concept of address space. The result is that the odp_thread_t type is being stretched too thin. Would it not make more sense to introduce an odp_process_t type that represents the address space that one or more threads execute in?
>>>
>>> I fully (110%!) agree that ODP has to be aware of address spaces!
>>>
>>>> It then becomes straightforward to say that while ODP handles have ODP instance scope, addresses derived from those handles have ODP process scope. Thus threads can freely share addresses derived from ODP handles only with other threads belonging to the same ODP process.
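A minimal sketch of how that handle-scope vs. address-scope split could look in application code, using only existing shm calls (odp_shm_lookup() by name stands in here for passing the handle over IPC); this illustrates the proposal, not agreed behavior:

#include <string.h>
#include <odp_api.h>

/* Process B: creates the object and publishes it under a well-known name.
 * The handle (or its name) is what crosses the process boundary, never the
 * virtual address returned by odp_shm_addr(). */
static void producer(void)
{
    odp_shm_t shm = odp_shm_reserve("flow_table", 1 << 16,
                                    ODP_CACHE_LINE_SIZE, 0);
    void *addr = odp_shm_addr(shm);   /* valid in this process only */

    memset(addr, 0, 1 << 16);
}

/* Process C: resolves the same object in its own address space. The handle
 * has instance scope, so the lookup succeeds; the address derived from it
 * is C's own, process-scoped one. */
static void consumer(void)
{
    odp_shm_t shm = odp_shm_lookup("flow_table");

    if (shm != ODP_SHM_INVALID) {
        void *local_addr = odp_shm_addr(shm); /* may differ from B's address */

        /* ... use local_addr, never a raw pointer received from B ... */
        (void)local_addr;
    }
}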
>>> I am not sure I fully understand how you see this: if odp_thread_t and odp_process_t remain OS things, how would this work? I mean, once the ODP app forks, would it call an ODP_create_process() instead of an ODP_thread_create()? And on join or wait, would it call odp_term_local() with both an odp_thread_t and an odp_process_t as parameters, to make sure ODP can figure out what's going on? I would like to see a clear scenario here. The other alternative is to have ODP wrappers around fork(), pthread_create(), wait() and join(). Then ODP has complete control and the application does not need to keep the OS and ODP "in sync". This allows our own definition of these objects, if we really want them to differ from their Linux counterparts.
>>>
>>>> In Monarch we support a single (default) process and all ODP threads run in this single process. For Tiger Moth we'd expect to support multiple processes, each of which can contain one or more ODP threads. We can leave the process creation mechanism to be specified by the implementation, just as we do threads today, but this would solve the semantic confusion we currently have between threads and address spaces.
>>>>
>>>> While ODP itself would support the concept of processes, each implementation can determine the number of processes (address spaces) that are supported for a given ODP implementation, and applications would query and respond to this limit just as they do today with the thread limit. We'd probably extend the thread limit to include both an absolute number of threads as well as the max supported in each process.
>>>
>>> This is really saying we are wrapping fork() and pthread_create(), isn't it? Why would ODP care about OS limits otherwise?
>>>
>>>> The ODP shm construct thus becomes a special object that has an addressability scope that can span ODP processes.
>>>
>>> I don't agree: if the definition of address space remains an OS thing (e.g. a process), ODP should obey OS rules.
>>>
>>>> However, we'd expect that each implementation can impose its own rules as to how this is managed. For example, on platforms that follow the Linux process model this means that shm's need to be created prior to process creation, because processes are created via the fork() mechanism, which ensures that shm's in the parent process are visible at the same virtual address in its children.
>>>
>>> Aren't you saying the opposite of the previous sentence here? I read this as if we should obey OS rules...
>>>
>>> Christophe
>>>
>>>> Other platforms that have more sophisticated virtual addressing capabilities (e.g., many mainframe systems) may behave differently. On such platforms a "pointer" is actually an address space identifier and offset pair, and the HW can efficiently perform virtual address translations on multiple address spaces simultaneously. So in this case a shm is simply a separate address space that consists only of data rather than code, with one or more threads of execution running in it.
>>>>
>>>>> 2) Defining new ODP objects
>>>>>
>>>>> I guess in this case, we'd have to create both ODP thread and ODP process objects and the related usual methods. The advantage of this approach is that ODP has more control over what is being done. It also allows for an ODP definition of things (such as the above mentioned exception), if we really want to...
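To make the "ODP wrappers around fork()/pthread_create()" alternative more tangible, one possible shape of such wrappers is sketched below. All names here are hypothetical; nothing like this exists in the ODP API or helpers today, and the sketch assumes plain Linux fork()/pthreads underneath:

#include <pthread.h>
#include <unistd.h>
#include <sys/wait.h>

/* Hypothetical ODP-controlled execution-unit wrappers. By owning creation,
 * ODP always knows whether an odpthread lives in a new address space (fork)
 * or shares the caller's (pthread), without the application having to keep
 * the OS and ODP "in sync". */

typedef enum {
    EXEC_MODE_PTHREAD,   /* shares the caller's address space */
    EXEC_MODE_PROCESS    /* gets its own address space via fork() */
} exec_mode_t;

typedef struct {
    exec_mode_t mode;
    union {
        pthread_t tid;
        pid_t     pid;
    };
} exec_unit_t;

static int exec_unit_create(exec_unit_t *unit, exec_mode_t mode,
                            void *(*entry)(void *), void *arg)
{
    unit->mode = mode;

    if (mode == EXEC_MODE_PTHREAD)
        return pthread_create(&unit->tid, NULL, entry, arg);

    pid_t pid = fork();

    if (pid < 0)
        return -1;
    if (pid == 0) {        /* child process runs the entry point */
        entry(arg);
        _exit(0);
    }
    unit->pid = pid;       /* parent records the child */
    return 0;
}

static int exec_unit_join(exec_unit_t *unit)
{
    if (unit->mode == EXEC_MODE_PTHREAD)
        return pthread_join(unit->tid, NULL);

    return waitpid(unit->pid, NULL, 0) > 0 ? 0 : -1;
}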
>>>>> It makes applications more portable (if ODP abstracts some of the common OS functionality, the apps using these ODP APIs become OS agnostic).
>>>>>
>>>>> As we already have done for shmem, ODP methods for ODP threads and processes could be provided to retrieve the underlying OS stuff (such as the PID...).
>>>>>
>>>>> The problem of having to assume a non-common VA within ODP remains.
>>>>>
>>>>> ... and of course, where should we stop wrapping the OS...?
>>>>>
>>>>> 3) There is a third alternative as well: stating that ODP is thread only :-). No process, no problem?
>>>>>
>>>>> ODP needs to run also as processes, since applications are built from many different libraries and legacy code which may require the process model. Maybe not all implementations would support it, but major ones should.
>>>>>
>>>>> -Petri
>>>>>
>>>>> Thank you for reading that far :-). Waiting for your comments...
>>>>>
>>>>> Christophe.

_______________________________________________
lng-odp mailing list
lng-odp@lists.linaro.org
https://lists.linaro.org/mailman/listinfo/lng-odp