I've submitted a patch to the User Guide to summarize what I believe we've agreed to over the past several discussion sessions. I admit it doesn't look nearly as good when explicitly stated. Please review and I'll work to refine it.
On Wed, May 18, 2016 at 2:25 AM, Christophe Milard <christophe.mil...@linaro.org> wrote:

> On 17 May 2016 at 23:19, Bill Fischofer <bill.fischo...@linaro.org> wrote:
>
>> Just to summarize the discussion from earlier today with a view towards how we move forward:
>>
>> - Because today we have threads that all share a single address space, ODP handles and addresses derived from them may be shared across the entire ODP instance and all threads running in it.
>
> Yes, this is my understanding.
>
>> - It's agreed that such free sharing makes life easy for application writers and would be desirable even when multiple processes are involved.
>
> If this means that the application is given a way to share a common address space (e.g. use pthreads, or allocate the resources it needs to share addresses on before fork), then yes.
>
>> - The straightforward way of preserving this behavior when threads may run as processes is for the application to communicate as part of odp_init_global() (via the odp_init_t parameters) the maximum amount of shared memory it will use.
>
> No, this is not my understanding: as I understood yesterday, we dropped the exception for shmem objects that was discussed earlier (which I think is good). What we now say, as far as I understood, is that we go for choice 1) depicted in the starting mail of this thread, meaning: addresses returned by the ODP API (and any other addresses) have the scope defined by the OS, and this applies to shmem addresses as well. If an application wants to share shmem addresses (or any other addresses) between ODP threads, these ODP threads will have to be, assuming Linux, either pthreads or processes which forked after the shmem_reserve() (or after the allocation of whatever the pointer points to). As a result, there is no need to know the total amount of memory the application may need in advance.
>
> But: the application will expect ODP objects (referenced by handles) to work across processes, as it should remain unaware of their internal implementation. For instance: a process 'A' may fork 2 processes 'B' and 'C'. B may create a pktio whose handle is sent (using some kind of IPC) to C. My understanding is that the application is allowed to send/receive packets to/from the pktio from both B and C. And it should work.
>
> This means that the pktio (within ODP) must be coded in a way that enables using it from different processes (even processes forked before the pktio creation). To achieve this, the pktio should either assume 2 completely different address spaces and cope with it (using offsets, IPC...) or/and be given a way to allocate shared memory guaranteed to be mapped at the same address. The second approach enables better performance.
>
> This is what I tried to say when talking about the 2 kinds of shared memory I could foresee: the usual odp_shmem_reserve() for the application, and an _odp_internal_shmem_reserve() (you will probably find a better name :-) ), not visible in the API, to reserve shared memory guaranteed to be mapped at the same address in all processes. The latter function would only be used by ODP itself, and may take its memory from a preallocated pool (I am not sure this is even needed on Linux, but it might be on other OSes). The size of this pool should be far less than the total memory size the app is expected to use. The memory allocated from this pool is for ODP internal usage only.
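To make the fork-after-reserve rule concrete, here is a minimal sketch of the behavior Christophe describes, assuming Linux fork() semantics and the Monarch-era odp_instance_t APIs; the shm name and size are illustrative only and most error handling is omitted:

#include <stdio.h>
#include <string.h>
#include <unistd.h>
#include <sys/wait.h>
#include <odp_api.h>

int main(void)
{
    odp_instance_t instance;

    if (odp_init_global(&instance, NULL, NULL) ||
        odp_init_local(instance, ODP_THREAD_CONTROL))
        return 1;

    /* Reserve the shared memory BEFORE forking: the child inherits the
     * parent's mappings, so the address below stays valid in both. */
    odp_shm_t shm = odp_shm_reserve("shared_state", 4096,
                                    ODP_CACHE_LINE_SIZE, 0);
    char *addr = odp_shm_addr(shm);

    strcpy(addr, "written by the parent");

    if (fork() == 0) {
        /* Child process: the same virtual address is usable here only
         * because the mapping predates the fork. A block reserved after
         * the fork would not have this property. */
        printf("child reads: %s\n", addr);
        _exit(0);
    }

    wait(NULL);
    odp_shm_free(shm);
    odp_term_local();
    odp_term_global(instance);
    return 0;
}

With this convention the application never needs to declare its total memory use up front; it only has to order its reservations and its forks correctly.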
>> - When ODP sees this, at odp_init_global() time it will reserve a single giant block of shared memory (possibly gigabytes in size) that will be used to satisfy all odp_shm_reserve() calls issued by this ODP instance. If the application underestimates its needs here, then the result will be an odp_shm_reserve() failure when this global shared memory block is exhausted, even if the underlying system still has more memory available.
>
> No, this is not my understanding. Hopefully the text above made it clearer...
>
>> - With this convention, as long as processes are created after odp_init_global() is called, all ODP handles and addresses derived from them will be sharable across the entire ODP instance as before, even if threads are executing in (child) processes created via an underlying Linux-style fork() operation. The downside is we may have to revisit the current "flags" argument to odp_shm_reserve(), or else have multiple pre-reserved blocks, one for each valid flags value.
>
> No, this is not my understanding. Hopefully the text above made it clearer...
>
>> - The alternative is to allow new shared memory objects to be created after process creation with the understanding that these objects will not be sharable across process boundaries.
>
> I thought this is what we said.
>
>> Tiger Moth could support either of these conventions; however, it would seem the first one (a single reserve during odp_init_global()) would be the most compatible with where we are today and potentially the more convenient model for current and future ODP applications.
>
> I am afraid there will be a rather large amount of work revisiting the code of all current ODP objects to allow process mode. There may be unpleasant grey zones as well, or at least places which would deserve clear docs: for instance, packet addresses will be valid based on their pool's address scope: 2 processes can expect a packet address to be sharable as long as its *pool* is created before the fork (not the packet itself). This is a consequence of the fact that memory is actually allocated at pool creation rather than at packet creation. Maybe some ODP implementations will allocate the packet at packet creation time, which may create inconsistencies between ODP implementations...
>
> I am afraid we are only seeing the tip of the iceberg at this stage, but this has to be clarified now, before ODP grows even more.
>
> Christophe.
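The pool/packet scope point above can be illustrated with a short sketch. It assumes a linux-generic style implementation where the pool's memory is mapped at odp_pool_create() time; as Christophe notes, an implementation that maps memory per packet would behave differently. Names and sizes are illustrative:

#include <stdio.h>
#include <unistd.h>
#include <sys/wait.h>
#include <odp_api.h>

int main(void)
{
    odp_instance_t instance;
    odp_pool_param_t params;

    odp_init_global(&instance, NULL, NULL);
    odp_init_local(instance, ODP_THREAD_CONTROL);

    odp_pool_param_init(&params);
    params.type    = ODP_POOL_PACKET;
    params.pkt.num = 64;
    params.pkt.len = 1500;

    /* The pool (and, in this implementation model, its backing memory)
     * is created BEFORE the fork. */
    odp_pool_t pool = odp_pool_create("pkt_pool", &params);

    if (fork() == 0) {
        odp_init_local(instance, ODP_THREAD_WORKER);

        /* A packet allocated AFTER the fork still yields an address that
         * is meaningful in the parent too, because the memory behind it
         * was mapped at pool creation time. An implementation allocating
         * memory per packet would break this expectation. */
        odp_packet_t pkt = odp_packet_alloc(pool, 256);

        printf("child packet data at %p\n", odp_packet_data(pkt));
        odp_packet_free(pkt);
        odp_term_local();
        _exit(0);
    }

    wait(NULL);
    odp_pool_destroy(pool);
    odp_term_local();
    odp_term_global(instance);
    return 0;
}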
>> On Tue, May 17, 2016 at 1:45 AM, Christophe Milard <christophe.mil...@linaro.org> wrote:
>>
>>> On 17 May 2016 at 02:09, Bill Fischofer <bill.fischo...@linaro.org> wrote:
>>>
>>>> On Fri, May 13, 2016 at 6:32 AM, Savolainen, Petri (Nokia - FI/Espoo) <petri.savolai...@nokia.com> wrote:
>>>>
>>>>> *From:* Christophe Milard [mailto:christophe.mil...@linaro.org]
>>>>> *Sent:* Friday, May 13, 2016 11:16 AM
>>>>> *To:* Petri Savolainen <petri.savolai...@linaro.org>; Brian Brooks <brian.bro...@linaro.org>; Bill Fischofer <bill.fischo...@linaro.org>; Mike Holmes <mike.hol...@linaro.org>; LNG ODP Mailman List <lng-odp@lists.linaro.org>; Anders Roxell <anders.rox...@linaro.org>
>>>>> *Cc:* Tobias Lindquist <tobias.lindqu...@ericsson.com>
>>>>> *Subject:* process thread and mixed mode
>>>>>
>>>>> Hi,
>>>>>
>>>>> Let me start in a bit unusual way :-). I do not like my "running things in process mode" patch series that much... But I do think it should be merged. With the mixed mode.
>>>>>
>>>>> I think mixed mode is a good addition, but it would need more options, e.g. --odph-num-proc <num> --odph-num-pthread-per-proc <num>, and can thus be added later. Pure proc mode is enough to get things started.
>>>>>
>>>>> The main goal of the patch series is to push the work towards a proper and consistent definition of some ODP low level concepts and objects. The patch series makes things fail, hence pushing for defining and coding things better, I hope. Removing "provocative" things like the mixed mode from it just defeats the purpose of this series: I'll be happy to remove it (and plenty of other things) when we know which well documented definition/rule makes this code not relevant.
>>>>>
>>>>> As long as we keep saying that ODP threads can be indifferently Linux processes or pthreads, I'll be thinking that this series should work, including any kind of mixed mode.
>>>>>
>>>>> However, processes and threads are different things in Linux, and I believe this distinction should be kept within ODP: trying to make processes behave as pthreads for some ODP objects (such as addresses of shmem objects) does not sound like a good idea to me. Yes, this can maybe be done with Linux, and even possibly without preallocating the memory, but still, Linux creates 2 distinct virtual address spaces on fork(), so why would ODP make any exception to that? ...And if we do want to create such exceptions (which I don't like), shouldn't ODP define its own concepts of process and thread, as they obviously wouldn't match 100% the definitions of the OS?
>>>>>
>>>>> A lot of application internal state data may be directly managed with the OS. But I think we cannot depend 100% on OS shared data (== remove the shm API altogether), since the HW/implementation could have multiple ways to optimize where/how the memory is allocated/mapped (chip internal vs. external memory, huge page mapped vs. normal page mapped, IO-MMU mapped, etc.)... or, when passing pointers to an ODP API, the API spec may require that the memory is allocated from SHM (the application would not know if the API function is implemented as HW or SW).
>>>>>
>>>>> I do nevertheless definitely agree with Petri when saying that ODP applications should be given a way to "share pointers" between odpthreads. Taking the "process only" behavior and forcing every concurrent execution task to use tricky offset calculations to find its data is too nasty and inefficient.
>>>>>
>>>>> I think this will boil down to 2 choices:
>>>>>
>>>>> 1) Take these object definitions from the OS and accept the OS rules.
>>>>>
>>>>> 2) Define our own ODP objects (and possibly define our own rules).
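For concreteness, the "tricky offset calculation" Christophe refers to above looks roughly like this when two processes cannot assume a common virtual address for the same shared block; this is a generic illustration, not ODP API:

#include <stddef.h>
#include <stdint.h>

/* Data structures inside the shared block cannot hold raw pointers;
 * they must hold offsets from the block base, and every access pays
 * a conversion against the local mapping address. */

typedef struct {
    uint64_t next_offset;   /* offset of the next node, not a pointer */
    uint64_t data_offset;   /* offset of the payload, not a pointer   */
} node_t;

/* Convert an offset stored in shared memory into a usable pointer,
 * using this process's own mapping base address. */
static inline void *off_to_ptr(void *local_base, uint64_t offset)
{
    return (char *)local_base + offset;
}

/* Convert a local pointer back into an offset before storing it. */
static inline uint64_t ptr_to_off(void *local_base, void *ptr)
{
    return (uint64_t)((char *)ptr - (char *)local_base);
}

/* Example walk: every dereference needs the extra arithmetic, and any
 * pointer accidentally stored as-is silently breaks in the other process. */
static void *next_payload(void *local_base, node_t *node)
{
    node_t *next = off_to_ptr(local_base, node->next_offset);

    return off_to_ptr(local_base, next->data_offset);
}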
>>>>> 1) Taking the OS definitions:
>>>>>
>>>>> Choice 1) is consistent as long as we accept the OS rules. Taking that choice would mean that any address remains valid within a process only: shmem pointers cannot be shared between different processes. I do not see that as a big problem, as an application can use pthreads if it needs to share the same VA, or can fork() at a convenient time. Moreover, existing applications which would be ported to ODP are probably OK, as they would already be using pthreads/processes in the correct way.
>>>>>
>>>>> A more tricky aspect of choice 1) is the development of ODP itself: operations involving many "ODPthreads" will have to cope with both processes and threads, possibly without a clear view of what is done by the application. (I guess that on Linux, if getpid()==gettid() when calling odp_init_local(), one can safely assume that the app forked...?) I guess it is doable: assuming all odpthreads are descendants of the instantiation process, ODP can initially reserve shared memory in odp_init_global() for its own purposes. It has a price: within ODP (and anything external to the app itself, such as drivers) a common VA cannot be assumed. ODP/drivers will have to be prepared for an app fork at any time and hence do their internal work as if pointers could not be shared (except within the initially allocated memory). This will cost ODP performance.
>>>>>
>>>>> Taking choice 1 probably also means that shmem should no longer be part of ODP: allocating memory is as much an OS thing as forking a process... why would shmem objects be ODP objects while ODP threads would not be?
>>>>>
>>>>> Memory is a HW resource (such as e.g. a packet interface), and may have different performance/properties depending on the SoC (CPU local memory, chip internal memory, a memory block carved out from a shared cache, DDR not shared with other NUMA nodes, DDR shared with other NUMA nodes, persistent memory, ...). By default, the operating system would give you access to the common external memory. SDKs typically have some way to manage shared memory.
>>>>>
>>>>> A thread of execution is defined by the OS and language, rather than by the SoC.
>>>>>
>>>>> An implementation may need additional parameters in global_init (implementation specific or standard), or as command line options, to prepare for a specific mode, max number of procs, max memory usage, etc.
>>>>
>>>> I think the fundamental problem here is that we've got an ODP concept for a thread of execution (odp_thread_t) but do not have any corresponding concept of address space. The result is that the odp_thread_t type is being stretched too thin. Would it not make more sense to introduce an odp_process_t type that represents the address space that one or more threads execute in?
>>>
>>> I fully (110%!) agree that ODP has to be aware of address spaces!
>>>
>>>> It then becomes straightforward to say that while ODP handles have ODP instance scope, addresses derived from those handles have ODP process scope. Thus threads can freely share addresses derived from ODP handles only with other threads belonging to the same ODP process.
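A minimal sketch of how that handle-scope vs. address-scope split could look in application code, using only existing shm calls (odp_shm_lookup() by name stands in here for passing the handle over IPC); this illustrates the proposal, not agreed behavior:

#include <string.h>
#include <odp_api.h>

/* Process B: creates the object and publishes it under a well-known name.
 * The handle (or its name) is what crosses the process boundary, never the
 * virtual address returned by odp_shm_addr(). */
static void producer(void)
{
    odp_shm_t shm = odp_shm_reserve("flow_table", 1 << 16,
                                    ODP_CACHE_LINE_SIZE, 0);
    void *addr = odp_shm_addr(shm);   /* valid in this process only */

    memset(addr, 0, 1 << 16);
}

/* Process C: resolves the same object in its own address space. The handle
 * has instance scope, so the lookup succeeds; the address derived from it
 * is C's own, process-scoped one. */
static void consumer(void)
{
    odp_shm_t shm = odp_shm_lookup("flow_table");

    if (shm != ODP_SHM_INVALID) {
        void *local_addr = odp_shm_addr(shm); /* may differ from B's address */

        /* ... use local_addr, never a raw pointer received from B ... */
        (void)local_addr;
    }
}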
>>> I am not sure I fully understand how you see this: if odp_thread_t and odp_process_t remain OS things, how would this work? I mean, once the ODP app forks, would it call an ODP_create_process() instead of an ODP_thread_create()? And on join or wait, would it call odp_term_local() with both an odp_thread_t and an odp_process_t as parameters, to make sure ODP can figure out what's going on? I would like to see a clear scenario here. The other alternative is to have ODP wrappers around fork(), pthread_create(), wait() and join(). Then ODP has complete control and the application does not need to keep the OS and ODP "in sync". This allows our own definition of these objects, if we really want them to differ from their Linux counterparts.
>>>
>>>> In Monarch we support a single (default) process and all ODP threads run in this single process. For Tiger Moth we'd expect to support multiple processes, each of which can contain one or more ODP threads. We can leave the process creation mechanism to be specified by the implementation, just as we do threads today, but this would solve the semantic confusion we currently have between threads and address spaces.
>>>>
>>>> While ODP itself would support the concept of processes, each implementation can determine the number of processes (address spaces) that are supported for a given ODP implementation, and applications would query and respond to this limit just as they do today with the thread limit. We'd probably extend the thread limit to include both an absolute number of threads as well as the max supported in each process.
>>>
>>> This is really saying we are wrapping fork() and pthread_create(), isn't it? Why would ODP care about OS limits otherwise?
>>>
>>>> The ODP shm construct thus becomes a special object that has an addressability scope that can span ODP processes.
>>>
>>> I don't agree: if the definition of address space remains an OS thing (e.g. a process), ODP should obey OS rules.
>>>
>>>> However, we'd expect that each implementation can impose its own rules as to how this is managed. For example, on platforms that follow the Linux process model this means that shm's need to be created prior to process creation, because processes are created via the fork() mechanism, which ensures that shm's in the parent process are visible at the same virtual address in its children.
>>>
>>> Aren't you saying the opposite of the previous sentence here? I read this as if we should obey OS rules...
>>>
>>> Christophe
>>>
>>>> Other platforms that have more sophisticated virtual addressing capabilities (e.g., many mainframe systems) may behave differently. On such platforms a "pointer" is actually an address space identifier and offset pair, and the HW can efficiently perform virtual address translations on multiple address spaces simultaneously. So in this case a shm is simply a separate address space that consists only of data rather than code, with one or more threads of execution running in it.
>>>>
>>>>> 2) Defining new ODP objects
>>>>>
>>>>> I guess in this case, we'd have to create both ODP thread and ODP process objects and the related usual methods. The advantage of this approach is that ODP has more control over what is being done. It also allows for an ODP definition of things (such as the above mentioned exception), if we really want to...
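To make the "ODP wrappers around fork()/pthread_create()" alternative more tangible, one possible shape of such wrappers is sketched below. All names here are hypothetical; nothing like this exists in the ODP API or helpers today, and the sketch assumes plain Linux fork()/pthreads underneath:

#include <pthread.h>
#include <unistd.h>
#include <sys/wait.h>

/* Hypothetical ODP-controlled execution-unit wrappers. By owning creation,
 * ODP always knows whether an odpthread lives in a new address space (fork)
 * or shares the caller's (pthread), without the application having to keep
 * the OS and ODP "in sync". */

typedef enum {
    EXEC_MODE_PTHREAD,   /* shares the caller's address space */
    EXEC_MODE_PROCESS    /* gets its own address space via fork() */
} exec_mode_t;

typedef struct {
    exec_mode_t mode;
    union {
        pthread_t tid;
        pid_t     pid;
    };
} exec_unit_t;

static int exec_unit_create(exec_unit_t *unit, exec_mode_t mode,
                            void *(*entry)(void *), void *arg)
{
    unit->mode = mode;

    if (mode == EXEC_MODE_PTHREAD)
        return pthread_create(&unit->tid, NULL, entry, arg);

    pid_t pid = fork();

    if (pid < 0)
        return -1;
    if (pid == 0) {        /* child process runs the entry point */
        entry(arg);
        _exit(0);
    }
    unit->pid = pid;       /* parent records the child */
    return 0;
}

static int exec_unit_join(exec_unit_t *unit)
{
    if (unit->mode == EXEC_MODE_PTHREAD)
        return pthread_join(unit->tid, NULL);

    return waitpid(unit->pid, NULL, 0) > 0 ? 0 : -1;
}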
>>>>> It makes applications more portable (if ODP abstracts some of the common OS functionality, the apps using these ODP APIs become OS agnostic).
>>>>>
>>>>> As we already have done for shmem, ODP methods for ODP threads and processes could be provided to retrieve the underlying OS stuff (such as the PID...).
>>>>>
>>>>> The problem of having to assume a non-common VA within ODP remains.
>>>>>
>>>>> ... and of course, where should we stop wrapping the OS...?
>>>>>
>>>>> 3) There is a third alternative as well: stating that ODP is thread only :-). No process, no problem?
>>>>>
>>>>> ODP needs to run also as processes, since applications are built from many different libraries and legacy code which may require the process model. Maybe not all implementations would support it, but major ones should.
>>>>>
>>>>> -Petri
>>>>>
>>>>> Thank you for reading that far :-). Waiting for your comments...
>>>>>
>>>>> Christophe.

_______________________________________________
lng-odp mailing list
lng-odp@lists.linaro.org
https://lists.linaro.org/mailman/listinfo/lng-odp