Hi Christophe and team,

For the shmem part:

S18: yes
S19, S20: I agree; Bill's comments are accurate and well formalized.

S21: this is to cover the case of shm addresses (pointers) passed within
data structures,

especially Bill's case 3) below, the context pointer getters
(odp_queue_context(), odp_packet_user_ptr(), odp_timeout_user_ptr(),
odp_tm_node_context(), odp_tm_queue_context(), ...).
I agree with the solution, but feel it may meet difficulties at runtime:

According to the man page:
Using shmat() with shmaddr equal to NULL is the preferred,
portable way of attaching a shared memory segment. Be aware
that the shared memory segment attached in this way may be
attached at *different addresses in different processes*.
Therefore, *any pointers maintained within the shared memory
must be made relative* (typically to the starting address of
the segment), rather than absolute.
I found a neat alternative, though it is only available on Windows:
based pointers (C++):
https://msdn.microsoft.com/en-us/library/57a97k4e.aspx

Maybe we can spend some time looking for a counterpart in the
standard C world; that would be perfect.
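
In the meantime, a minimal sketch of what such a counterpart could look
like in plain C (offsets relative to the segment base; the names here
are just illustrative):

#include <stddef.h>

/* Store offsets, not raw pointers, inside the shared segment. */
typedef ptrdiff_t rel_ptr_t;

static inline rel_ptr_t rel_save(const void *base, const void *ptr)
{
    return (const char *)ptr - (const char *)base;
}

static inline void *rel_load(void *base, rel_ptr_t rel)
{
    return (char *)base + rel;
}

Each process converts using the base address of its own mapping of the
segment, so a stored rel_ptr_t stays valid even when shmat() attaches
the segment at different addresses.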

S23: I agree; Bill's comments cover the cases.

Thanks and best regards, Yi

On 8 June 2016 at 17:04, Christophe Milard <christophe.mil...@linaro.org>
wrote:

> OK, good that you agree (and please update the shared doc so it
> becomes the central point of information).
> There is something I like, though, in your willingness to decouple the
> function and the pinning...
> Even if I am not sure this can be enforced by ODP at all times (as
> already stated), there is definitely a point in helping the
> applications that wish to do so. So please keep an eye on that!
>
> Your opinion on S19, S20 and S21 would be very welcome as well... This
> is the main pain point...
>
> Christophe
>
> On 8 June 2016 at 09:41, Yi He <yi...@linaro.org> wrote:
> > Hi, thanks Christophe, and happy to discuss with and learn from the
> > team in yesterday's ARCH call :)
> >
> > The question which triggers this kind of thinking is: how to use ODP
> > as a framework to produce re-usable building blocks to compose a
> > "Network Function Instance" at runtime, since only at runtime are the
> > resources prepared for the function to settle down; thus comes the
> > thought of separating function implementation and launching.
> >
> > I agree with your point that it seems an upper-layer consideration.
> > I'll take some time to gain a deeper understanding of and knowledge
> > about how the upper layers work, so I agree to the current S11, S12,
> > S13, S14, S15, S16, S17 approach, and we can revisit if we realize
> > that upper-layer programming practices/models really affect ODP's
> > design in this aspect.
> >
> > Best Regards, Yi
> >
> > On 7 June 2016 at 16:47, Christophe Milard
> > <christophe.mil...@linaro.org> wrote:
> >>
> >> On 7 June 2016 at 10:22, Yi He <yi...@linaro.org> wrote:
> >> > Hi, team
> >> >
> >> > I send my thoughts on the ODP thread part:
> >> >
> >> > S1, S2, S3, S4
> >> > Yi: Yes
> >> > This set of statements defines the ODP thread concept as a higher
> >> > level abstraction of the underlying concurrent execution context.
> >> >
> >> > S5, S6, S7, S8, S9, S10:
> >> > Yi: Yes
> >> > This set of statements adds several constraints upon the ODP
> >> > instance concept.
> >> >
> >> > S11, S12, S13, S14, S15, S16, S17:
> >> > Yi: Not very much
> >> >
> >> > Currently an ODP application still mixes the Function
> >> > implementation and its deployment; by this I mean that in ODP
> >> > application code there is code to implement the Function, and there
> >> > is also code to deploy the Function onto platform resources (CPU
> >> > cores).
> >> >
> >> > I'd like to propose a programming practice/model as an attempt to
> >> > decouple these two things: ODP application code only focuses on
> >> > implementing the Function, and leaves it to a deployment script or
> >> > launcher to take care of the deployment with different resource
> >> > specs (and sufficiency checks as well).
> >>
> >> Well, that goes straight against Barry's will that apps should pin
> >> their tasks on cpus...
> >> Please join the public call today: if Barry is there you can discuss
> >> this; I am happy to hear other voices in this discussion.
> >>
> >> >
> >> > /* Use Case: an upper-layer orchestrator, say OpenDaylight, is
> >> >  * deploying an ODP application (which can be a VNF instance, in
> >> >  * my opinion) onto some platform:
> >> >  */
> >> >
> >> > /* The application can accept command line options */
> >> >
> >> > --list-launchable
> >> >     List all algorithm functions that need to be arranged into an
> >> >     execution unit.
> >> >
> >> > --preparing-threads
> >> >     control<1,2,3>@cpu<1>, worker<1,2,3,4>@cpu<2,3,4,5>
> >> >     I'm not sure the control/worker distinction is needed anymore;
> >> >     DPDK has a similar concept, the lcore.
> >> >
> >> > --wait-launching-signal
> >> >     The main process can pause after preparing the above threads
> >> >     but before launching algorithm functions; here the orchestrator
> >> >     can further fine-tune the threads by scripting (cgroups, etc.):
> >> >     CPU, disk/net IO, memory quotas, interrupt bindings, etc.
> >> >
> >> > --launching
> >> >     main@control<1>,algorithm_one@control<2>,
> >> >     algorithm_two@worker<1,2,3,4>...
> >> >
> >> > In source code, the only thing the ODP library and application
> >> > need to do related to deployment is to declare the launchable
> >> > algorithm functions by adding them into a special text section:
> >> >
> >> > int main(...) __attribute__((section(".launchable")));
> >> > int algorithm_one(void *) __attribute__((section(".launchable")));
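> >> >
> >> > A launcher could enumerate these if, instead of the code itself,
> >> > descriptor records (name + entry point) are placed in the section;
> >> > GNU ld then provides __start_/__stop_ symbols for it. A minimal
> >> > sketch (assuming the section name is a valid C identifier, hence
> >> > "launchable" without the dot):
> >> >
> >> > #include <stdio.h>
> >> >
> >> > typedef struct {
> >> >     const char *name;
> >> >     int (*entry)(void *);
> >> > } launchable_t;
> >> >
> >> > #define LAUNCHABLE(fn) \
> >> >     static const launchable_t fn##_desc \
> >> >     __attribute__((section("launchable"), used)) = { #fn, fn }
> >> >
> >> > LAUNCHABLE(algorithm_one);
> >> >
> >> > /* section bounds generated by the linker */
> >> > extern const launchable_t __start_launchable[],
> >> >                           __stop_launchable[];
> >> >
> >> > static void list_launchable(void) /* what --list-launchable shows */
> >> > {
> >> >     for (const launchable_t *d = __start_launchable;
> >> >          d < __stop_launchable; d++)
> >> >         printf("%s\n", d->name);
> >> > }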
> >>
> >> Interesting... Does it need to be part of ODP, though? Couldn't the
> >> ODP API provide the means of pinning, and the apps that wish to could
> >> provide the options you mentioned: i.e. the application that wants to
> >> would pin according to its command line interface, and the
> >> application that can't/doesn't want to would pin manually...
> >> I mean, one could also imagine cases where the app needs to pin: if
> >> specific HW is connected to some cpus, or if serialisation exists
> >> (e.g. something like: pin tasks 1, 2 and 3 to cpus 1, 2, 3, and then,
> >> when this work is done (e.g. after a barrier joining tasks 1, 2, 3),
> >> pin tasks 4, 5, 6 on cpus 1, 2, 3 again). There could theoretically
> >> be complex dependency graphs where all the cpus cannot be pinned from
> >> the beginning, and where your approach seems too restrictive.
> >>
> >> But I am glad to hear your voice on the arch call :-)
> >>
> >> Christophe.
> >> >
> >> > Best Regards, Yi
> >> >
> >> >
> >> > On 4 June 2016 at 05:56, Bill Fischofer <bill.fischo...@linaro.org>
> >> > wrote:
> >> >>
> >> >> I realized I forgot to respond to S23. Corrected here.
> >> >>
> >> >> On Fri, Jun 3, 2016 at 4:15 AM, Christophe Milard
> >> >> <christophe.mil...@linaro.org> wrote:
> >> >>
> >> >> > since V3: Update following Bill's comments
> >> >> > since V2: Update following Barry and Bill's comments
> >> >> > since V1: Update following arch call 31 may 2016
> >> >> >
> >> >> > This is a tentative summary of the discussions around the
> >> >> > thread/process issues that have been happening these last weeks.
> >> >> > Sorry for the formalism of this mail, but it seems we need
> >> >> > accuracy here...
> >> >> >
> >> >> > This summary is organized as follows:
> >> >> >
> >> >> > It is a set of statements, each of them expecting a separate
> >> >> > answer from you. When no specific ODP version is specified, the
> >> >> > statement regards the "ultimate" goal (i.e. what we eventually
> >> >> > want to achieve).
> >> >> > Each statement is prefixed with:
> >> >> >   - a statement number for further reference (e.g. S1)
> >> >> >   - a status word (one of 'agreed', 'open', or 'closed').
> >> >> > Agreed statements expect a yes/no answer: 'yes' meaning that you
> >> >> > acknowledge that this is your understanding of the agreement and
> >> >> > will not nack an implementation based on this statement. You can
> >> >> > comment after a yes, but your comment will not block any
> >> >> > implementation based on the agreed statement. A 'no' implies
> >> >> > that the statement does not reflect your understanding of the
> >> >> > agreement, or that you refuse the proposal.
> >> >> > Any 'no' received on an 'agreed' statement will push it back as
> >> >> > 'open'.
> >> >> > Open statements are fully open for further discussion.
> >> >> >
> >> >> > S1  -agreed: an ODP thread is an OS/platform concurrent
> >> >> > execution environment object (as opposed to an ODP object). No
> >> >> > more specific definition is given by the ODP API itself.
> >> >> >
> >> >> > Barry: YES
> >> >> > Bill: Yes
> >> >> >
> >> >> > ---------------------------
> >> >> >
> >> >> > S2  -agreed: Each ODP implementation must tell what is allowed
> >> >> > to be used as an ODP thread for that specific implementation: a
> >> >> > linux-based implementation, for instance, will have to state
> >> >> > whether odp threads can be linux pthreads, linux processes, or
> >> >> > both, or any other type of concurrent execution environment.
> >> >> > ODP implementations can put any restriction they wish on what an
> >> >> > ODP thread is allowed to be. This should be documented in the
> >> >> > ODP implementation documentation.
> >> >> >
> >> >> > Barry: YES
> >> >> > Bill: Yes
> >> >> >
> >> >> > ---------------------------
> >> >> >
> >> >> > S3  -agreed: in the linux generic ODP implementation an
> >> >> > odpthread will be either:
> >> >> >         * a linux process descendant of (or the same as) the odp
> >> >> > instantiation process;
> >> >> >         * a pthread 'member' of a linux process descendant of
> >> >> > (or the same as) the odp instantiation process.
> >> >> >
> >> >> > Barry: YES
> >> >> > Bill: Yes
> >> >> >
> >> >> > ---------------------------
> >> >> >
> >> >> > S4  -agreed: For monarch, the linux generic ODP implementation
> >> >> > only supports odp threads as pthread members of the
> >> >> > instantiation process.
> >> >> >
> >> >> > Barry: YES
> >> >> > Bill: Yes
> >> >> >
> >> >> > ---------------------------
> >> >> >
> >> >> > S5  -agreed: whether multiple instances of ODP can be run on the
> >> >> > same machine is left as an implementation decision. The ODP
> >> >> > implementation document should state what is supported; any
> >> >> > restriction is allowed.
> >> >> >
> >> >> > Barry: YES
> >> >> > Bill: Yes
> >> >> >
> >> >> > ---------------------------
> >> >> >
> >> >> > S6  -agreed: The l-g odp implementation will support multiple
> >> >> > odp instances whose instantiation processes are different and
> >> >> > not ancestors/descendants of each other. Different instances of
> >> >> > ODP will, of course, be restricted in sharing common OS
> >> >> > resources (the total amount of memory available for each ODP
> >> >> > instance may decrease as the number of instances increases,
> >> >> > access to network interfaces will probably be granted to the
> >> >> > first instance grabbing the interface and denied to the
> >> >> > others... other rules may apply when sharing other common ODP
> >> >> > resources.)
> >> >> >
> >> >> > Bill: Yes
> >> >> >
> >> >> > ---------------------------
> >> >> >
> >> >> > S7  -agreed: the l-g odp implementation will not support
> >> >> > multiple ODP instances initiated from the same linux process
> >> >> > (calling odp_init_global() multiple times).
> >> >> > As an illustration, this means that a single process P is not
> >> >> > allowed to execute the following calls (in any order):
> >> >> > instance1 = odp_init_global()
> >> >> > instance2 = odp_init_global()
> >> >> > pthread_create (and, in that thread, run odp_init_local(instance1))
> >> >> > pthread_create (and, in that thread, run odp_init_local(instance2))
> >> >> >
> >> >> > Bill: Yes
> >> >> >
> >> >> > -------------------
> >> >> >
> >> >> > S8  -agreed: the l-g odp implementation will not support
> >> >> > multiple ODP instances initiated from related linux processes
> >> >> > (descendants/ancestors of each other), which would enable ODP
> >> >> > 'sub-instances'. As an illustration, this means that the
> >> >> > following is not supported:
> >> >> > instance1 = odp_init_global()
> >> >> > pthread_create (and, in that thread, run odp_init_local(instance1))
> >> >> > if (fork()==0) {
> >> >> >     instance2 = odp_init_global()
> >> >> >     pthread_create (and, in that thread, run
> >> >> > odp_init_local(instance2))
> >> >> > }
> >> >> >
> >> >> > Bill: Yes
> >> >> >
> >> >> > --------------------
> >> >> >
> >> >> > S9  -agreed: the odp instance passed as a parameter to
> >> >> > odp_init_local() must always be one of the odp_instances
> >> >> > returned by odp_init_global()
> >> >> >
> >> >> > Barry: YES
> >> >> > Bill: Yes
> >> >> >
> >> >> > ---------------------------
> >> >> >
> >> >> > S10 -agreed: For l-g, if the answers to S7 and S8 are 'yes',
> >> >> > then due to S3, the odp_instance an odp_thread can attach to is
> >> >> > completely defined by the ancestors of the thread, making the
> >> >> > odp_instance parameter of odp_init_local() redundant. The odp
> >> >> > l-g implementation guide will point out this redundancy, but
> >> >> > will stress that even in this case the parameter to
> >> >> > odp_init_local() still has to be set correctly, as its usage is
> >> >> > internal to the implementation.
> >> >> >
> >> >> > Barry: I think so
> >> >> > Bill: This practice also ensures that applications behave
> >> >> > unchanged if and when multi-instance support is added, so I
> >> >> > don't think we need to be apologetic about this parameter
> >> >> > requirement.
> >> >> >
> >> >> > ---------------------------
> >> >> >
> >> >> > S11 -agreed: at odp_init_global() time, the application will
> >> >> > provide 3 sets of cpus (i.e. 3 cpu masks):
> >> >> >         -the control cpu mask
> >> >> >         -the worker cpu mask
> >> >> >         -the odp service cpu mask (i.e. the set of cpus odp can
> >> >> > take for its own usage)
> >> >> > Note: The service CPU mask will be introduced post monarch
> >> >> >
> >> >> > Bill: Yes
> >> >> > Barry: YES
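> >> >> >
> >> >> > (A sketch of how this could look at the API level; the odp_init_t
> >> >> > field names below are assumptions, and the service mask is still
> >> >> > to come post monarch:)
> >> >> >
> >> >> > odp_cpumask_t ctrl_mask, work_mask;
> >> >> > odp_init_t param;
> >> >> > odp_instance_t inst;
> >> >> >
> >> >> > odp_cpumask_zero(&ctrl_mask);
> >> >> > odp_cpumask_set(&ctrl_mask, 0);   /* control thread on cpu 0 */
> >> >> > odp_cpumask_zero(&work_mask);
> >> >> > odp_cpumask_set(&work_mask, 1);   /* workers on cpus 1 and 2 */
> >> >> > odp_cpumask_set(&work_mask, 2);
> >> >> >
> >> >> > odp_init_param_init(&param);      /* assumed initializer name */
> >> >> > param.control_cpus = &ctrl_mask;
> >> >> > param.worker_cpus  = &work_mask;
> >> >> > /* the odp service cpu mask would be a third member here */
> >> >> >
> >> >> > odp_init_global(&inst, &param, NULL);
> >> >> >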
> >> >> > ---------------------------
> >> >> >
> >> >> > S12 -agreed: the odp implementation may return an error at
> >> >> > odp_init_global() call time if the number of cpus in the odp
> >> >> > service mask (or their 'position') does not match the ODP
> >> >> > implementation's needs.
> >> >> >
> >> >> > Barry: YES
> >> >> > Bill: Yes. However, an implementation may fail an
> >> >> > odp_init_global() call for any resource insufficiency, not just
> >> >> > cpus.
> >> >> >
> >> >> >
> >> >> > ---------------------------
> >> >> >
> >> >> > S13 -agreed: the application is fully responsible for pinning
> >> >> > its own odp threads to different cpus, and this is done directly
> >> >> > through OS system calls, or via helper functions (as opposed to
> >> >> > ODP API calls). This pinning should be done among the cpus of
> >> >> > the worker cpu mask or the control cpu mask.
> >> >> >
> >> >> > Barry: YES, but I support the existence of helper functions to
> >> >> > do this – including the important case of pinning the main
> >> >> > thread
> >> >> >
> >> >> > Bill: Yes. And I agree an ODP helper is useful here (which is
> >> >> > why odp-linux provides one).
> >> >> >
> >> >> > ---------------------------
> >> >> >
> >> >> > S14 -agreed: whether more than one odp thread can be pinned to
> >> >> > the same cpu is left as an implementation choice (and the answer
> >> >> > to that question can be different for the service, worker and
> >> >> > control threads). This choice should be well documented in the
> >> >> > implementation user manual.
> >> >> >
> >> >> > Barry: YES
> >> >> > Bill: Yes
> >> >> >
> >> >> > ---------------------------
> >> >> >
> >> >> > S15 -agreed: the odp implementation is responsible for pinning
> >> >> > its own service threads among the cpus of the odp service cpu
> >> >> > mask.
> >> >> >
> >> >> > Barry: YES, in principle – BUT be aware that currently the l-g
> >> >> > ODP implementation (and perhaps many others) cannot call the
> >> >> > helper functions (unless inlined), so this internal pinning may
> >> >> > not be well coordinated with the helpers.
> >> >> >
> >> >> > Bill: Yes. And I agree with Barry on the helper recursion
> >> >> > issue. We should fix that so there is no conflict between
> >> >> > implementation-internal pinning and application pinning
> >> >> > attempts.
> >> >> >
> >> >> > ---------------------------
> >> >> >
> >> >> > S16 -open: why does the odp implementation need to know the
> >> >> > control and worker masks? If S13 is true, shouldn't these two
> >> >> > masks be part of the helper only? (meaning that S11 is wrong)
> >> >> >
> >> >> > Barry: Currently it probably doesn't NEED them, but perhaps in
> >> >> > the future, with some new APIs and capabilities, it might
> >> >> > benefit from this information, and so I would leave them in.
> >> >> >
> >> >> > Bill: The implementation sees these because they are how a
> >> >> > provisioning agent (e.g., OpenDaylight) would pass higher-level
> >> >> > configuration information through the application to the
> >> >> > underlying ODP implementation. The initial masks specified on
> >> >> > odp_init_global() are used in the implementation of the
> >> >> > odp_cpumask_default_worker(), odp_cpumask_default_control(), and
> >> >> > odp_cpumask_all_available() APIs.
> >> >> >
> >> >> > ---------------------------
> >> >> >
> >> >> > S17 -open: should the masks passed as parameters to
> >> >> > odp_init_global() have the same "namespace" as those used
> >> >> > internally within ODP?
> >> >> >
> >> >> > Barry: YES
> >> >> > Bill: Yes. I'm not sure what it would mean for them to be in a
> >> >> > different namespace. How would those be bridged if they weren't?
> >> >> >
> >> >> >
> >> >> > ---------------------------
> >> >> >
> >> >> > S18 -agreed: ODP handles are valid over the whole ODP instance,
> >> >> > i.e. any odp handle remains valid among all the odpthreads of
> >> >> > the ODP instance regardless of the odp thread type (process,
> >> >> > thread or whatever): an ODP thread A can pass an odp handle to
> >> >> > another ODP thread B (using any kind of IPC), and B can use the
> >> >> > handle.
> >> >> >
> >> >> > Bill: Yes
> >> >> >
> >> >> > -----------------
> >> >> >
> >> >> > S19 -open: any pointer retrieved by an ODP call (such as
> >> >> > odp_*_get_addr()) follows the rules defined by the OS, with the
> >> >> > possible exception defined in S21. For the linux generic ODP
> >> >> > implementation, this means that pointers are fully shareable
> >> >> > when using pthreads, and that pointers pointing to shared mem
> >> >> > areas will be shareable as long as the fork() happens after the
> >> >> > shm_reserve().
> >> >> >
> >> >> > Barry: NO. Disagree. I would prefer to see a consistent ODP
> >> >> > answer on this topic, and in particular I don't even believe
> >> >> > that most OS's "have rules defining ...", since in fact one can
> >> >> > make programs run under Linux which can share pointers
> >> >> > regardless of the ordering of fork() calls. Most OS's have lots
> >> >> > of (continually evolving) capabilities in the category of
> >> >> > sharing memory, and so "following the rules of the OS" is not
> >> >> > well defined.
> >> >> > Instead, I prefer a simpler rule. Memory reserved using the
> >> >> > special flag is guaranteed to use the same addresses across
> >> >> > processes, and all other pointers are not guaranteed to be the
> >> >> > same nor guaranteed to be different, so the ODP programmer
> >> >> > should avoid any such assumptions for maximum portability. But
> >> >> > of course programmers often only consider a subset of possible
> >> >> > targets (e.g. how many programmers consider porting to an 8-bit
> >> >> > CPU or a machine with a 36-bit word length), and so they may
> >> >> > happily take advantage of certain non-guaranteed assumptions.
> >> >> >
> >> >> >
> >> >> > Bill: As I noted earlier, we have to distinguish between
> >> >> > different types of memory and where these pointers come from.
> >> >> > If the application is using malloc() or some other OS API to get
> >> >> > memory and then using that memory's address as, for example, a
> >> >> > queue context pointer, then it is taking responsibility for
> >> >> > ensuring that these pointers are meaningful to whoever sees
> >> >> > them. ODP isn't going to do anything to help there. So this
> >> >> > question really only refers to addresses returned from ODP APIs.
> >> >> > If we look for void * returns in the ODP API we see that the
> >> >> > only addresses ODP returns are:
> >> >> >
> >> >> > 1) Those that enable addressability to buffers and packets
> >> >> > (odp_buffer_addr(), odp_packet_data(), odp_packet_offset(), etc.)
> >> >> >
> >> >> > These addresses are intended to be used within the scope of the
> >> >> > calling thread and should not be assumed to have any validity
> >> >> > outside of that context, because the buffer/packet is the
> >> >> > durable object and any addresses are just (potentially
> >> >> > transient) mappings of that object for use by that thread.
> >> >> > Packet and buffer handles (not addresses) are passed between
> >> >> > threads via queues, and the receiver issues its own such calls
> >> >> > on the received handles to get its own addressability to these
> >> >> > objects. Whether or not these addresses are the same is purely
> >> >> > internal to an ODP implementation and is not visible to the
> >> >> > application.
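> >> >> >
> >> >> > A minimal sketch of that pattern (setup omitted; the names are
> >> >> > purely illustrative):
> >> >> >
> >> >> > /* Thread A: passes the handle, never the address */
> >> >> > odp_packet_t pkt = odp_packet_alloc(pool, len);
> >> >> > uint8_t *a_data = odp_packet_data(pkt);  /* valid in A only */
> >> >> > odp_queue_enq(queue, odp_packet_to_event(pkt));
> >> >> >
> >> >> > /* Thread B: gets its own addressability from the handle */
> >> >> > odp_packet_t rx = odp_packet_from_event(odp_queue_deq(queue));
> >> >> > uint8_t *b_data = odp_packet_data(rx);   /* valid in B only */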
> >> >> >
> >> >> > 2) Packet user areas (odp_packet_user_area()). This API returns
> >> >> > the address of a preallocated user area associated with the
> >> >> > packet (size defined by the pool that the packet was drawn from
> >> >> > at odp_pool_create() time by the max_uarea_size entry in the
> >> >> > odp_pool_param_t). Since this is metadata associated with the
> >> >> > packet, this API may be called by any thread that obtains the
> >> >> > odp_packet_t for the packet that contains that user area.
> >> >> > However, like the packet itself, the scope of this returned
> >> >> > address is the calling thread. So the address returned by
> >> >> > odp_packet_user_area() should not be cached or passed to any
> >> >> > other thread. Each thread that needs addressability to this area
> >> >> > makes its own call, and whether these returned addresses are the
> >> >> > same or different is again internal to the implementation and
> >> >> > not visible to the application. Note that just as two threads
> >> >> > should not have ownership of an odp_packet_t at the same time,
> >> >> > two threads should not be trying to access the user area
> >> >> > associated with a packet either.
> >> >> >
> >> >> > 3) Context pointer getters (odp_queue_context(),
> >> >> > odp_packet_user_ptr(), odp_timeout_user_ptr(),
> >> >> > odp_tm_node_context(), odp_tm_queue_context(), and the user
> >> >> > context pointer carried in the odp_crypto_params_t struct)
> >> >> >
> >> >> > These are set by the application using corresponding setter APIs
> >> >> > or provided as values in structs, so the application either
> >> >> > obtains these pointers on its own, in which case it is
> >> >> > responsible for ensuring that they are meaningful to whoever
> >> >> > retrieves them, or from an odp_shm_t. So these are not a special
> >> >> > case in themselves.
> >> >> >
> >> >> > 4) ODP shared memory (odp_shm_addr(), odp_shm_info()). These
> >> >> > APIs return addresses to odp_shm_t objects that are specifically
> >> >> > created to support sharing. The rule here is simple: the scope
> >> >> > of any returned shm address is determined by the sharing flag
> >> >> > specified at odp_shm_reserve() time. ODP currently defines two
> >> >> > such flags: ODP_SHM_SW_ONLY and ODP_SHM_PROC. We simply need to
> >> >> > define precisely the intended sharing scope of these two (or any
> >> >> > new flags we define) to answer this question. Note that context
> >> >> > pointers drawn from odp_shm_t objects would then have whatever
> >> >> > sharing attributes the shm object has, thus completely defining
> >> >> > case (3).
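> >> >> >
> >> >> > For example (a sketch; ctx_t and the other names are
> >> >> > illustrative, and error handling is omitted):
> >> >> >
> >> >> > odp_shm_t shm = odp_shm_reserve("q_ctx", sizeof(ctx_t),
> >> >> >                                 ODP_CACHE_LINE_SIZE,
> >> >> >                                 ODP_SHM_PROC);
> >> >> > ctx_t *ctx = odp_shm_addr(shm); /* scope set by the flag */
> >> >> > /* a context pointer drawn from this shm (case 3) inherits
> >> >> >  * the shm's sharing scope */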
> >> >> >
> >> >> > ---------------------
> >> >> >
> >> >> > S20 -open: by default, shmem addresses (returned by odp_shm_addr())
> >> >> > follow the OS rules, as defined by S19.
> >> >> >
> >> >> > Ola: The question is which OS rules apply (an OS can have
> >> >> > different rules for different OS objects, e.g. memory regions
> >> >> > allocated using malloc and mmap will behave differently). I
> >> >> > think the answer depends on how ODP shmem objects are
> >> >> > implemented. Only the ODP implementation knows how ODP shmem
> >> >> > objects are created (e.g. using some OS system call, or
> >> >> > manipulating the page tables directly). So essentially the
> >> >> > sharability of pointers is ODP implementation specific (although
> >> >> > ODP implementations on the same OS can be expected to behave the
> >> >> > same). Conclusion: we actually don't specify anything at all
> >> >> > here; it is completely up to the ODP implementation.
> >> >> > What is required/expected by ODP applications? If we don't make
> >> >> > applications happy, ODP is unlikely to succeed.
> >> >> > I think many applications are happy with a single-process thread
> >> >> > model where all memory is shared and pointers can be shared
> >> >> > freely.
> >> >> > I hear of some applications that require a multi-process thread
> >> >> > model; I expect that those applications also want to be able to
> >> >> > share memory and pointers freely between them, at least memory
> >> >> > that was specifically allocated to be shared (so-called shared
> >> >> > memory regions; what's otherwise the purpose of such memory
> >> >> > regions?).
> >> >> >
> >> >> > Barry: Disagree with the same comments as in S19.
> >> >> >
> >> >> > Bill: I believe my discourse on S19 completely resolves this
> >> >> > question. This is controlled by the share flag specified at
> >> >> > odp_shm_reserve() time. We just need to specify the sharing
> >> >> > scope implied by each of these, and then it is up to each
> >> >> > implementation to see that such scope is realized.
> >> >> >
> >> >> > ---------------------
> >> >> >
> >> >> > S21 -open: shm will support an extra flag at shm_reserve() call
> >> >> > time: SHM_XXX. The usage of this flag will allocate shared
> >> >> > memory guaranteed to be located at the same virtual address on
> >> >> > all odpthreads of the odp_instance. Pointers to this shared
> >> >> > memory type are therefore fully sharable, even between
> >> >> > odpthreads running in different VA spaces (e.g. processes). The
> >> >> > amount of memory which can be allocated using this flag can be
> >> >> > limited to any value by the ODP implementation, down to zero
> >> >> > bytes, meaning that some odp implementations may not support
> >> >> > this option at all; shm_reserve() will return an error in this
> >> >> > case. The usage of this flag by the application is therefore not
> >> >> > recommended. The ODP implementation may require a hint about the
> >> >> > size of this area at odp_init_global() call time.
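> >> >> >
> >> >> > As an illustration (SHM_XXX is still the placeholder name; a
> >> >> > sketch of the intended behaviour):
> >> >> >
> >> >> > odp_shm_t shm = odp_shm_reserve("shared_tbl", size,
> >> >> >                                 ODP_CACHE_LINE_SIZE, SHM_XXX);
> >> >> > if (shm == ODP_SHM_INVALID) {
> >> >> >     /* the implementation cannot guarantee a common VA (it may
> >> >> >      * support 0 bytes of such memory): fall back or abort */
> >> >> > }
> >> >> > void *addr = odp_shm_addr(shm); /* same value in every odpthread
> >> >> >                                  * of the instance */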
> >> >> >
> >> >> > Barry: Mostly agree, except for the comment about the special
> >> >> > flag not being recommended.
> >> >> >
> >> >> > Ola: Agree. Some/many applications will want to share memory
> >> >> > between threads/processes and must be able to do so. Some ODP
> >> >> > platforms may have limitations on the amount of memory (if any)
> >> >> > that can be shared and may thus fail to run certain
> >> >> > applications. Such is life. I don't see a problem with that.
> >> >> > Possibly we should remove the phrase "not recommended" and just
> >> >> > state that portability may be limited.
> >> >> >
> >> >> >
> >> >> > Bill: Yes. As noted in S19 and S20, the intent of the share
> >> >> > flag is to specify the desired addressability scope for the
> >> >> > returned odp_shm_t. It's perfectly reasonable to define multiple
> >> >> > such scopes that may have different intended uses (and
> >> >> > implementation costs).
> >> >> >
> >> >> > ------------------
> >> >> >
> >> >> > S22 -open: please put here your name suggestions for this
> >> >> > SHM_XXX flag :-).
> >> >> >
> >> >> > Ola:
> >> >> > SHM_I_REALLY_WANT_TO_SHARE_THIS_MEMORY
> >> >> >
> >> >> > Bill: I previously suggested ODP_SHM_INSTANCE, which specifies
> >> >> > that the sharing scope of this odp_shm_t is the entire ODP
> >> >> > instance.
> >> >> >
> >> >> > ------------------
> >> >> >
> >> >> > S23 -open: The rules above define relatively well the behaviour
> >> >> > of pointers retrieved by a call to odp_shm_addr(). But many
> >> >> > points need to be defined regarding pointers to other ODP
> >> >> > objects: What is the validity of a pointer to a packet, for
> >> >> > instance? If process A creates a packet pool P, then forks B and
> >> >> > C, and B allocates a packet from P and retrieves a pointer to
> >> >> > it... Is this pointer valid in A and C? In the current l-g
> >> >> > implementation, it will be... Is this behaviour something we
> >> >> > wish to enforce on every odp implementation? What about other
> >> >> > objects: buffers, atomics... Some clear rule has to be defined
> >> >> > here: how things behave, and whether this behaviour is part of
> >> >> > the ODP API or just specific to different implementations...
> >> >> >
> >> >> > Ola: Perhaps we need the option to specify the
> >> >> > I_REALLY_WANT_TO_SHARE_THIS_MEMORY flag when creating all types
> >> >> > of ODP pools? An ODP implementation can always fail to create
> >> >> > such a pool if the sharability requirement cannot be satisfied.
> >> >> > Allocation of locations used for atomic operations is the
> >> >> > responsibility of the application, which can (and must) choose a
> >> >> > suitable type of memory. It is better that sharability is an
> >> >> > explicit requirement from the application. It should be
> >> >> > specified as a flag parameter to the different calls that
> >> >> > create/allocate regions of memory (shmem, different types of
> >> >> > pools).
> >> >> >
> >> >> >
> >> >> > Barry:
> >> >> > Again refer to my S19 answer. Specifically it is about what is
> >> >> > GUARANTEED regarding pointer validity, not whether the pointers
> >> >> > in certain cases will happen to be the same. So for your
> >> >> > example, the pointer is not guaranteed to be valid in A and C,
> >> >> > but the programmer might well believe that for all the ODP
> >> >> > platforms and implementations they expect to run on, this is
> >> >> > very likely to be the case, in which case we can't stop them
> >> >> > from constraining their program's portability – no more than
> >> >> > requiring them to be able to port to a ternary (3-valued "bit")
> >> >> > architecture.
> >> >> >
> >> >>
> >> >>
> >> >> Bill: See my S19 response, which answers all these questions. The
> >> >> fork case you mention is moot because addresses returned by ODP
> >> >> packet operations are only valid in the thread that makes those
> >> >> calls. The handle must be valid throughout the instance, so that
> >> >> may affect how the implementation chooses to implement packet pool
> >> >> creation, but such questions are internal to each implementation
> >> >> and not part of the API.
> >> >>
> >> >> Atomics are application-declared entities, and so memory sharing
> >> >> is again covered by my response to S19. If the atomic is in
> >> >> storage the application allocated from the OS, then its
> >> >> sharability is the responsibility of the application. If the
> >> >> atomic is part of a struct that resides in an odp_shm_t, then its
> >> >> sharability is governed by the share flag of the odp_shm it is
> >> >> taken from.
> >> >>
> >> >> No additional parameters are required for pools because what is
> >> >> shared from them is handles, not addresses. As noted, when these
> >> >> handles are converted to addresses, those addresses are only valid
> >> >> for the calling thread. All other threads that have access to the
> >> >> handle make their own calls, and whether or not those two
> >> >> addresses have any relationship to each other is not visible to
> >> >> the application.
> >> >>
> >> >>
> >> >> >
> >> >>
> >> >> > ---------------------
> >> >> >
> >> >> > Thanks for your feedback!
> >> >> >
> >> >> >
> >> >> >
> >> >> >
> >> >
> >> >
> >
> >
>
_______________________________________________
lng-odp mailing list
lng-odp@lists.linaro.org
https://lists.linaro.org/mailman/listinfo/lng-odp
