To start with: thanks, Bill, for taking the time to answer :-). I DO appreciate it.
The only thing I really want us to agree on is that your approach is
really a handles-only approach (when communicating between odp threads),
regardless of what a thread is. Because if we don't define the behaviour
in the other cases, then it cannot be used. If we agree on that, then at
least, I can say I understand your proposal. As long as you say "this is
the minimum we guarantee", and the "maximum" is not defined, I feel
anyone will have to use the minimum. => handles

I am afraid a handles-only approach would have too poor performance on
some systems, yes, but that is another story. It may even be wrong. To
start with, I just want to understand how your proposal could be used by
a programmer.

On 9 June 2016 at 16:30, Bill Fischofer <bill.fischo...@linaro.org> wrote:
>
> On Thu, Jun 9, 2016 at 9:01 AM, Christophe Milard
> <christophe.mil...@linaro.org> wrote:
>>
>> On 9 June 2016 at 14:30, Bill Fischofer <bill.fischo...@linaro.org> wrote:
>> >
>> > On Thu, Jun 9, 2016 at 7:13 AM, Christophe Milard
>> > <christophe.mil...@linaro.org> wrote:
>> >>
>> >> Bill:
>> >> S19: When you write "These addresses are intended to be used within
>> >> the scope of the calling thread and should not be assumed to have
>> >> any validity outside of that context", do you really mean this,
>> >> regardless of how ODP threads are implemented?
>> >
>> > Yes. This is simply a statement of the minimum guarantees that ODP
>> > makes. Applications that wish to ensure maximum portability should
>> > follow this practice. If they wish to make use of special knowledge
>> > outside of what the ODP API guarantees, they are free to do so with
>> > the understanding that they are trading off portability for some
>> > other benefit known to them.
>>
>> But who is expected to provide that "special knowledge outside of what
>> the ODP API guarantees"? I guess that the fact that a packet data
>> address returned by an ODP API call is valid within all odpthreads
>> sharing the pool (i.e. if you fork after pool create, you are fine)
>> belongs to that "special knowledge"...
>> Where do you expect the user to find this kind of information if not
>> in the ODP docs?
>>
>> How would a user know that a child process can make use of an odp
>> pointer after fork? The user then has to know whether the pointer
>> points to a shared page or not... These are things that the odp
>> implementation knows...
>
> The object you are sharing is the packet (represented by its handle),
> not a pointer to some part of a packet mapping. That's the key
> distinction. Separating object manipulation from object mapping is one
> of the key distinguishing features of ODP that makes it possible to
> implement it on a wide array of platforms that have very different
> internal memory models.
>
> Pointers are needed only if you want to request unmediated access to
> some part of a packet's contents. If you're manipulating packet
> headroom or tailroom, for example, you're using the packet handle, not
> pointers. If you're splitting a packet you use packet handles, not
> pointers. If you're adding, removing, or copying data from a packet
> you're using handles, not pointers. The ODP APIs are handle-oriented,
> not pointer-oriented, so that's what you share between the various
> parts of your application.
>
>> How would a user dare assume things which are not documented? When
>> using pthreads, who says odp_init_local() does not invalidate some of
>> the ODP pointers?
>>
>> As a programmer, I'd go for the rule: undefined = don't use.
>> Following that rule would make me use handles only. Portable: yes.
>> Acceptable performance-wise... not sure...
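To make that rule concrete, this is roughly what the handles-only
pattern looks like in code. Just a sketch, assuming the monarch-era API
(the queue and offset variables are illustrative, error handling
omitted):

    #include <odp_api.h>

    /* Producer side: share the packet by handle only, never by address. */
    static void send_packet(odp_queue_t q, odp_packet_t pkt)
    {
        (void)odp_queue_enq(q, odp_packet_to_event(pkt));
    }

    /* Consumer side: derive a fresh, thread-local address from the
     * handle; no pointer ever crosses an odp thread boundary. */
    static void receive_packet(odp_queue_t q, uint32_t my_offset)
    {
        odp_event_t ev = odp_queue_deq(q);

        if (ev == ODP_EVENT_INVALID)
            return;

        odp_packet_t pkt = odp_packet_from_event(ev);
        uint32_t seglen;
        uint8_t *data = odp_packet_offset(pkt, my_offset, &seglen, NULL);

        /* ... work on at most seglen contiguous bytes at data ... */
        odp_packet_free(pkt);
    }

If the "maximum" stays undefined, this is the only shape a portable
application can take.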
>
> No, that's not what we're saying. What we're saying is that packets are
> represented by handles and programs communicate by passing these
> handles around via event queues. If you want to look at the contents of
> a packet you can obtain addressability to portions of that packet via
> well-defined ODP API calls, and those pointers have meaning for you.
> When someone else obtains access to the packet handle it may do the
> same.
>
> What is the specific use case you're worried about? Petri mentioned
> wanting to have multiple threads work on a single packet at the same
> time. Ignoring for a moment the rarity of such cases (the normal packet
> processing model is one packet per thread, and the reason why we have
> things like ordered queues is so that you can process multiple packets
> in a single flow concurrently), this is trivially handled today without
> further specification. Presumably each thread wants to work on a
> different offset in the packet, so each such thread simply calls
> odp_packet_offset(pkt, myoffset, &seglen, NULL) to get addressability
> to that part of the packet and goes about its business. Note that there
> is no other way you could program this, since you cannot "compute" a
> proper packet address from another address because of possible
> discontinuities due to segment boundaries. That's what the ODP API
> does, so you just use it as intended.
>
>>
>> Jerrin:
>> S18: your answer to S18 is confusing me. See my returned question in
>> the doc. Please answer it :-)
>>
>> All: I have updated the doc with a summary table and the comments from
>> Jon Rosen. Please check that what I wrote matches what you think.
>>
>> I have also added a list of possible behaviours (numbered A to F) for
>> S19. Can you read these and pick your choice? If none of these
>> alternatives matches your choice, please add your description.
>>
>> Christophe
>>
>> On 9 June 2016 at 02:53, Bill Fischofer <bill.fischo...@linaro.org>
>> wrote:
>> >
>> > On Wed, Jun 8, 2016 at 10:54 AM, Yi He <yi...@linaro.org> wrote:
>> >>
>> >> Hi, Christophe and team
>> >>
>> >> For the shmem part:
>> >>
>> >> S18: yes
>> >> S19, S20: I agree; Bill's comments are very accurate and
>> >> formalized.
>> >>
>> >> S21: to cover the case of shm addresses (pointers) passed within
>> >> data structures, especially think of cases:
>> >> 3) Context pointer getters (odp_queue_context(),
>> >> odp_packet_user_ptr(), odp_timeout_user_ptr(),
>> >> odp_tm_node_context(), odp_tm_queue_context(), etc.)
>> >> I agree with the solution, only I feel it may meet difficulties
>> >> at runtime:
>> >>
>> >> According to the man page:
>> >>     Using shmat() with shmaddr equal to NULL is the preferred,
>> >>     portable way of attaching a shared memory segment. Be aware
>> >>     that the shared memory segment attached in this way may be
>> >>     attached at different addresses in different processes.
>> >>     Therefore, any pointers maintained within the shared memory
>> >>     must be made relative (typically to the starting address of
>> >>     the segment), rather than absolute.
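(Interjecting here, since the std C counterpart is asked for just below:
the classic trick is to store offsets from the segment base instead of
raw pointers. A minimal sketch; all names are illustrative:)

    #include <stdint.h>

    typedef uint64_t shm_off_t;     /* offset from the segment base */

    /* Each process converts offsets using its own attach address. */
    static inline void *off2ptr(void *base, shm_off_t off)
    {
        return (char *)base + off;
    }

    static inline shm_off_t ptr2off(void *base, const void *ptr)
    {
        return (shm_off_t)((const char *)ptr - (const char *)base);
    }

    /* E.g. a list node stored inside the shared segment keeps the
     * offset of the next node, so the link stays valid at whatever
     * address shmat() returned in this particular process. */
    struct node {
        shm_off_t next;             /* ptr2off(base, next_node) */
        int value;
    };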
>> >> >> I found an alternate sweetheart, only available in Windows,
>> >> >> called based pointers (C++):
>> >> >> https://msdn.microsoft.com/en-us/library/57a97k4e.aspx
>> >> >>
>> >> >> Maybe we can spend some time looking for a counterpart in the
>> >> >> std C world; that would be perfect.
>> >> >
>> >> > For students of programming history, or those old enough to
>> >> > remember, the concept of BASED storage originated in the (now
>> >> > ancient) programming language PL/I. Pointers to BASED storage
>> >> > were stored as offsets and the compiler automatically handled
>> >> > the relative addressing. They are very convenient for this sort
>> >> > of purpose.
>> >> >
>> >> >> S23: agree, Bill's comments covered the cases.
>> >> >>
>> >> >> Thanks and best regards, Yi
>> >> >>
>> >> >> On 8 June 2016 at 17:04, Christophe Milard
>> >> >> <christophe.mil...@linaro.org> wrote:
>> >>>
>> >>> OK. Good that you agree (and please update the shared doc so it
>> >>> becomes the central point of information).
>> >>> There is something I like, though, in your willingness to decouple
>> >>> the function and the pinning...
>> >>> Even if I am not sure this can be enforced by ODP at all times (as
>> >>> already stated), there is definitely a point in helping the
>> >>> application that wishes to do so. So please keep an eye on that!
>> >>>
>> >>> Your opinion on S19, S20 and S21 would be very welcome as well...
>> >>> This is the main pain point....
>> >>>
>> >>> Christophe
>> >>>
>> >>> On 8 June 2016 at 09:41, Yi He <yi...@linaro.org> wrote:
>> >>> > Hi, thanks Christophe, and happy to discuss with and learn from
>> >>> > the team in yesterday's ARCH call :)
>> >>> >
>> >>> > The question which triggered this kind of thinking is: how to
>> >>> > use ODP as a framework to produce re-usable building blocks to
>> >>> > compose a "Network Function Instance" at runtime, since it is
>> >>> > only at runtime that resources are prepared for a function to
>> >>> > settle down; thus came the thought of separating function
>> >>> > implementation and launching.
>> >>> >
>> >>> > I agree with your point that this seems an upper-layer
>> >>> > consideration. I'll take some time to gain deeper understanding
>> >>> > and knowledge of how the upper layers play, so I agree to the
>> >>> > current S11, S12, S13, S14, S15, S16, S17 approach, and we can
>> >>> > revisit if we realize that upper-layer programming
>> >>> > practices/models really affect ODP's design in this aspect.
>> >>> >
>> >>> > Best Regards, Yi
>> >>> >
>> >>> > On 7 June 2016 at 16:47, Christophe Milard
>> >>> > <christophe.mil...@linaro.org> wrote:
>> >>> >>
>> >>> >> On 7 June 2016 at 10:22, Yi He <yi...@linaro.org> wrote:
>> >>> >> > Hi, team
>> >>> >> >
>> >>> >> > I send my thoughts on the ODP thread part:
>> >>> >> >
>> >>> >> > S1, S2, S3, S4
>> >>> >> > Yi: Yes
>> >>> >> > This set of statements defines the ODP thread concept as a
>> >>> >> > higher-level abstraction of the underlying concurrent
>> >>> >> > execution context.
>> >>> >> >
>> >>> >> > S5, S6, S7, S8, S9, S10:
>> >>> >> > Yi: Yes
>> >>> >> > This set of statements adds several constraints upon the ODP
>> >>> >> > instance concept.
>> >> >>> >> > >> >> >>> >> > S11, S12, S13, S14, S15, S16, S17: >> >> >>> >> > Yi: Not very much >> >> >>> >> > >> >> >>> >> > Currently ODP application still mixes the Function >> >> >>> >> > Implementation >> >> >>> >> > and its Deployment, by this I mean in ODP application code, >> >> >>> >> > there >> >> >>> >> > is code to implement Function, there is also code to deploy >> >> >>> >> > the >> >> >>> >> > Function onto platform resources (CPU cores). >> >> >>> >> > >> >> >>> >> > I'd like to propose a programming practice/model as an attempt >> >> >>> >> > to >> >> >>> >> > decouple these two things, ODP application code only focuses >> >> >>> >> > on >> >> >>> >> > implementing Function, and leave it to a deployment script or >> >> >>> >> > launcher to take care the deployment with different resource >> >> >>> >> > spec >> >> >>> >> > (and sufficiency check also). >> >> >>> >> >> >> >>> >> Well, that goes straight against Barry's will that apps should >> >> >>> >> pin >> >> >>> >> their tasks on cpus... >> >> >>> >> Please join the public call today: if Barry is there you can >> >> >>> >> discuss >> >> >>> >> this: I am happy to hear other voices in this discussion >> >> >>> >> >> >> >>> >> > >> >> >>> >> > /* Use Case: a upper layer orchestrator, say Daylight is >> >> >>> >> > deploying >> >> >>> >> > * an ODP application (can be a VNF instance in my opinion) >> >> >>> >> > onto >> >> >>> >> > * some platform: >> >> >>> >> > */ >> >> >>> >> > >> >> >>> >> > /* The application can accept command line options */ >> >> >>> >> > --list-launchable >> >> >>> >> > list all algorithm functions that need to be arranged >> >> >>> >> > into an execution unit. >> >> >>> >> > >> >> >>> >> > --preparing-threads >> >> >>> >> > control<1,2,3>@cpu<1>, worker<1,2,3,4>@cpu<2,3,4,5> >> >> >>> >> > I'm not sure if the control/worker distinguish is needed >> >> >>> >> > anymore, DPDK has similar concept lcore. >> >> >>> >> > >> >> >>> >> > --wait-launching-signal >> >> >>> >> > main process can pause after preparing above threads but >> >> >>> >> > before launching algorithm functions, here orchestrator can >> >> >>> >> > further fine-tuning the threads by scripting (cgroups, etc). >> >> >>> >> > CPU, disk/net IO, memory quotas, interrupt bindings, etc. >> >> >>> >> > >> >> >>> >> > --launching >> >> >>> >> > main@control<1>,algorithm_one@control<2>, >> >> >>> >> > algorithm_two@worker<1,2,3,4>... >> >> >>> >> > >> >> >>> >> > In source code, the only thing ODP library and application >> >> >>> >> > need >> >> >>> >> > to >> >> >>> >> > do >> >> >>> >> > related to deployment is to declare launchable algorithm >> >> >>> >> > functions >> >> >>> >> > by >> >> >>> >> > adding them into special text section: >> >> >>> >> > >> >> >>> >> > int main(...) __attribute__((section(".launchable"))); >> >> >>> >> > int algorithm_one(void *) >> >> >>> >> > __attribute__((section(".launchable"))); >> >> >>> >> >> >> >>> >> interresting ... Does it need to be part of ODP, though? Cannot >> >> >>> >> the >> >> >>> >> ODP api provide means of pinning, and the apps that wish it can >> >> >>> >> provide the options you mentioned to do it: i.e. the application >> >> >>> >> that >> >> >>> >> wants would pin according to its command line interface and the >> >> >>> >> application that can't/does not want would pin manually... 
>> >>> >> I mean, one could also imagine cases where the app needs to
>> >>> >> pin: if specific HW exists connected to some cpus, or if
>> >>> >> serialisation exists (e.g. something like: pin tasks 1, 2 and 3
>> >>> >> to cpus 1, 2, 3 and then, when this work is done (e.g. after a
>> >>> >> barrier joining tasks 1, 2, 3), pin tasks 4, 5, 6 on cpus 1, 2,
>> >>> >> 3 again). There could theoretically be complex such dependency
>> >>> >> graphs, where all cpus cannot be pinned from the beginning, and
>> >>> >> where your approach seems too restrictive.
>> >>> >>
>> >>> >> But I am glad to hear your voice on the arch call :-)
>> >>> >>
>> >>> >> Christophe.
>> >>> >> >
>> >>> >> > Best Regards, Yi
>> >>> >> >
>> >>> >> > On 4 June 2016 at 05:56, Bill Fischofer
>> >>> >> > <bill.fischo...@linaro.org> wrote:
>> >>> >> >>
>> >>> >> >> I realized I forgot to respond to s23. Corrected here.
>> >>> >> >>
>> >>> >> >> On Fri, Jun 3, 2016 at 4:15 AM, Christophe Milard <
>> >>> >> >> christophe.mil...@linaro.org> wrote:
>> >>> >> >>
>> >>> >> >> > since V3: Update following Bill's comments
>> >>> >> >> > since V2: Update following Barry and Bill's comments
>> >>> >> >> > since V1: Update following arch call 31 May 2016
>> >>> >> >> >
>> >>> >> >> > This is an attempt to sum up the discussions around the
>> >>> >> >> > thread/process questions that have been happening these
>> >>> >> >> > last weeks.
>> >>> >> >> > Sorry for the formalism of this mail, but it seems we need
>> >>> >> >> > accuracy here...
>> >>> >> >> >
>> >>> >> >> > This summary is organized as follows:
>> >>> >> >> >
>> >>> >> >> > It is a set of statements, each of them expecting a
>> >>> >> >> > separate answer from you. When no specific ODP version is
>> >>> >> >> > specified, the statement regards the "ultimate" goal (i.e.
>> >>> >> >> > what we want eventually to achieve).
>> >>> >> >> > Each statement is prefixed with:
>> >>> >> >> > - a statement number for further reference (e.g. S1)
>> >>> >> >> > - a status word (one of 'agreed', 'open', or 'closed').
>> >>> >> >> > Agreed statements expect a yes/no answer: 'yes' meaning
>> >>> >> >> > that you acknowledge that this is your understanding of
>> >>> >> >> > the agreement and will not nack an implementation based on
>> >>> >> >> > this statement. You can comment after a yes, but your
>> >>> >> >> > comment will not block any implementation based on the
>> >>> >> >> > agreed statement. A 'no' implies that the statement does
>> >>> >> >> > not reflect your understanding of the agreement, or that
>> >>> >> >> > you refuse the proposal.
>> >>> >> >> > Any 'no' received on an 'agreed' statement will push it
>> >>> >> >> > back to 'open'.
>> >>> >> >> > Open statements are fully open for further discussion.
>> >>> >> >> >
>> >>> >> >> > S1 -agreed: an ODP thread is an OS/platform concurrent
>> >>> >> >> > execution environment object (as opposed to an ODP
>> >>> >> >> > object).
>> >>> >> >> > No more specific definition is given by the ODP API
>> >>> >> >> > itself.
>> >>> >> >> >
>> >>> >> >> > Barry: YES
>> >>> >> >> > Bill: Yes
>> >>> >> >> >
>> >>> >> >> > ---------------------------
>> >>> >> >> >
>> >>> >> >> > S2 -agreed: Each ODP implementation must state what is
>> >>> >> >> > allowed to be used as an ODP thread for that specific
>> >>> >> >> > implementation: a linux-based implementation, for
>> >>> >> >> > instance, will have to state whether odp threads can be
>> >>> >> >> > linux pthreads, linux processes, or both, or any other
>> >>> >> >> > type of concurrent execution environment. ODP
>> >>> >> >> > implementations can put any restriction they wish on what
>> >>> >> >> > an ODP thread is allowed to be. This should be documented
>> >>> >> >> > in the ODP implementation documentation.
>> >>> >> >> >
>> >>> >> >> > Barry: YES
>> >>> >> >> > Bill: Yes
>> >>> >> >> >
>> >>> >> >> > ---------------------------
>> >>> >> >> >
>> >>> >> >> > S3 -agreed: in the linux generic ODP implementation an
>> >>> >> >> > odpthread will be either:
>> >>> >> >> >     * a linux process descendant of (or the same as) the
>> >>> >> >> >       odp instantiation process;
>> >>> >> >> >     * a pthread 'member' of a linux process descendant of
>> >>> >> >> >       (or the same as) the odp instantiation process.
>> >>> >> >> >
>> >>> >> >> > Barry: YES
>> >>> >> >> > Bill: Yes
>> >>> >> >> >
>> >>> >> >> > ---------------------------
>> >>> >> >> >
>> >>> >> >> > S4 -agreed: For monarch, the linux generic ODP
>> >>> >> >> > implementation only supports odp threads that are pthread
>> >>> >> >> > members of the instantiation process.
>> >>> >> >> >
>> >>> >> >> > Barry: YES
>> >>> >> >> > Bill: Yes
>> >>> >> >> >
>> >>> >> >> > ---------------------------
>> >>> >> >> >
>> >>> >> >> > S5 -agreed: whether multiple instances of ODP can be run
>> >>> >> >> > on the same machine is left as an implementation decision.
>> >>> >> >> > The ODP implementation document should state what is
>> >>> >> >> > supported, and any restriction is allowed.
>> >>> >> >> >
>> >>> >> >> > Barry: YES
>> >>> >> >> > Bill: Yes
>> >>> >> >> >
>> >>> >> >> > ---------------------------
>> >>> >> >> >
>> >>> >> >> > S6 -agreed: The l-g odp implementation will support
>> >>> >> >> > multiple odp instances whose instantiation processes are
>> >>> >> >> > different and not ancestors/descendants of each other.
>> >>> >> >> > Different instances of ODP will, of course, be restricted
>> >>> >> >> > in sharing common OS resources (the total amount of memory
>> >>> >> >> > available for each ODP instance may decrease as the number
>> >>> >> >> > of instances increases, the access to network interfaces
>> >>> >> >> > will probably be granted to the first instance grabbing
>> >>> >> >> > the interface and denied to others... some other rules may
>> >>> >> >> > apply when sharing other common ODP resources).
>> >>> >> >> >
>> >>> >> >> > Bill: Yes
>> >>> >> >> >
>> >>> >> >> > ---------------------------
>> >>> >> >> >
>> >>> >> >> > S7 -agreed: the l-g odp implementation will not support
>> >>> >> >> > multiple ODP instances initiated from the same linux
>> >>> >> >> > process (calling odp_init_global() multiple times).
>> >>> >> >> > As an illustration, this means that a single process P is
>> >>> >> >> > not allowed to execute the following calls (in any order):
>> >>> >> >> > instance1 = odp_init_global()
>> >>> >> >> > instance2 = odp_init_global()
>> >>> >> >> > pthread_create (and, in that thread, run
>> >>> >> >> > odp_local_init(instance1) )
>> >>> >> >> > pthread_create (and, in that thread, run
>> >>> >> >> > odp_local_init(instance2) )
>> >>> >> >> >
>> >>> >> >> > Bill: Yes
>> >>> >> >> >
>> >>> >> >> > -------------------
>> >>> >> >> >
>> >>> >> >> > S8 -agreed: the l-g odp implementation will not support
>> >>> >> >> > multiple ODP instances initiated from related linux
>> >>> >> >> > processes (descendants/ancestors of each other), which
>> >>> >> >> > would have enabled ODP 'sub-instances'. As an
>> >>> >> >> > illustration, this means that the following is not
>> >>> >> >> > supported:
>> >>> >> >> > instance1 = odp_init_global()
>> >>> >> >> > pthread_create (and, in that thread, run
>> >>> >> >> > odp_local_init(instance1) )
>> >>> >> >> > if (fork()==0) {
>> >>> >> >> > instance2 = odp_init_global()
>> >>> >> >> > pthread_create (and, in that thread, run
>> >>> >> >> > odp_local_init(instance2) )
>> >>> >> >> > }
>> >>> >> >> >
>> >>> >> >> > Bill: Yes
>> >>> >> >> >
>> >>> >> >> > --------------------
>> >>> >> >> >
>> >>> >> >> > S9 -agreed: the odp instance passed as parameter to
>> >>> >> >> > odp_local_init() must always be one of the odp_instances
>> >>> >> >> > returned by odp_global_init().
>> >>> >> >> >
>> >>> >> >> > Barry: YES
>> >>> >> >> > Bill: Yes
>> >>> >> >> >
>> >>> >> >> > ---------------------------
>> >>> >> >> >
>> >>> >> >> > S10 -agreed: For l-g, if the answers to S7 and S8 are
>> >>> >> >> > 'yes', then, due to S3, the odp_instance an odp_thread can
>> >>> >> >> > attach to is completely defined by the ancestor of the
>> >>> >> >> > thread, making the odp_instance parameter of
>> >>> >> >> > odp_init_local redundant.
>> >>> >> >> > The odp l-g implementation guide will highlight this
>> >>> >> >> > redundancy, but will stress that even in this case the
>> >>> >> >> > parameter to odp_local_init() still has to be set
>> >>> >> >> > correctly, as its usage is internal to the implementation.
>> >>> >> >> >
>> >>> >> >> > Barry: I think so
>> >>> >> >> > Bill: This practice also ensures that applications behave
>> >>> >> >> > unchanged if and when multi-instance support is added, so
>> >>> >> >> > I don't think we need to be apologetic about this
>> >>> >> >> > parameter requirement.
>> >>> >> >> >
>> >>> >> >> > ---------------------------
>> >>> >> >> >
>> >>> >> >> > S11 -agreed: at odp_global_init() time, the application
>> >>> >> >> > will provide 3 sets of cpus (i.e. 3 cpu masks):
>> >>> >> >> >     - the control cpu mask
>> >>> >> >> >     - the worker cpu mask
>> >>> >> >> >     - the odp service cpu mask (i.e. the set of cpus odp
>> >>> >> >> >       can take for its own usage)
>> >>> >> >> > Note: The service CPU mask will be introduced
>> >>> >> >> > post-monarch.
>> >>> >> >> >
>> >>> >> >> > Bill: Yes
>> >>> >> >> > Barry: YES
>> >>> >> >> > ---------------------------
>> >>> >> >> >
>> >>> >> >> > S12 -agreed: the odp implementation may return an error at
>> >>> >> >> > odp_init_global() call time if the number of cpus in the
>> >>> >> >> > odp service mask (or their 'position') does not match the
>> >>> >> >> > ODP implementation's needs.
>> >>> >> >> >
>> >>> >> >> > Barry: YES
>> >>> >> >> > Bill: Yes. However, an implementation may fail an
>> >>> >> >> > odp_init_global() call for any resource insufficiency, not
>> >>> >> >> > just cpus.
>> >>> >> >> >
>> >>> >> >> > ---------------------------
>> >>> >> >> >
>> >>> >> >> > S13 -agreed: the application is fully responsible for
>> >>> >> >> > pinning its own odp threads to different cpus, and this is
>> >>> >> >> > done directly through OS system calls, or via helper
>> >>> >> >> > functions (as opposed to ODP API calls). This pinning
>> >>> >> >> > should be made among cpus that are members of the worker
>> >>> >> >> > cpu mask or the control cpu mask.
>> >>> >> >> >
>> >>> >> >> > Barry: YES, but I support the existence of helper
>> >>> >> >> > functions to do this – including the important case of
>> >>> >> >> > pinning the main thread.
>> >>> >> >> >
>> >>> >> >> > Bill: Yes. And I agree an ODP helper is useful here (which
>> >>> >> >> > is why odp-linux provides one).
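(To make S13 concrete: application-side pinning driven by the worker
mask, done with plain OS calls. A Linux-specific sketch, error handling
omitted; spawn_pinned_workers is an illustrative name:)

    #define _GNU_SOURCE
    #include <pthread.h>
    #include <sched.h>
    #include <odp_api.h>

    /* Spawn one worker per cpu in the default worker mask and pin it
     * there with an OS call, not an ODP API call. */
    static void spawn_pinned_workers(void *(*worker_fn)(void *), void *arg)
    {
        odp_cpumask_t mask;
        int num = odp_cpumask_default_worker(&mask, 0);
        int cpu = odp_cpumask_first(&mask);

        for (int i = 0; i < num; i++) {
            cpu_set_t set;
            pthread_attr_t attr;
            pthread_t tid;

            CPU_ZERO(&set);
            CPU_SET(cpu, &set);
            pthread_attr_init(&attr);
            pthread_attr_setaffinity_np(&attr, sizeof(set), &set);
            pthread_create(&tid, &attr, worker_fn, arg);
            pthread_attr_destroy(&attr);
            cpu = odp_cpumask_next(&mask, cpu);
        }
    }

This is roughly the kind of thing the odp-linux helper wraps for the
application.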
>> >> >>> >> >> > >> >> >>> >> >> > --------------------------- >> >> >>> >> >> > >> >> >>> >> >> > S14 -agreed: whether more than one odp thread can be pinned >> >> >>> >> >> > to >> >> >>> >> >> > the >> >> >>> >> >> > same cpu is left as an implementation choice (and the >> >> >>> >> >> > answer >> >> >>> >> >> > to >> >> >>> >> >> > that >> >> >>> >> >> > question can be different for the service, worker and >> >> >>> >> >> > control >> >> >>> >> >> > threads). This choice should be well documented in the >> >> >>> >> >> > implementation >> >> >>> >> >> > user manual. >> >> >>> >> >> > >> >> >>> >> >> > Barry: YES >> >> >>> >> >> > Bill: Yes >> >> >>> >> >> > >> >> >>> >> >> > --------------------------- >> >> >>> >> >> > >> >> >>> >> >> > S15 -agreed: the odp implementation is responsible of >> >> >>> >> >> > pinning >> >> >>> >> >> > its >> >> >>> >> >> > own >> >> >>> >> >> > service threads among the cpu member of the odp service cpu >> >> >>> >> >> > mask. >> >> >>> >> >> > >> >> >>> >> >> > Barry: YES, in principle – BUT be aware that currently the >> >> >>> >> >> > l-g >> >> >>> >> >> > ODP >> >> >>> >> >> > implementation >> >> >>> >> >> > (and perhaps many others) cannot call the helper functions >> >> >>> >> >> > (unless >> >> >>> >> >> > inlined), >> >> >>> >> >> > so this internal pinning may not be well coordinated with >> >> >>> >> >> > the >> >> >>> >> >> > helpers. >> >> >>> >> >> > >> >> >>> >> >> > Bill: Yes. And I agree with Barry on the helper recursion >> >> >>> >> >> > issue. >> >> >>> >> >> > We >> >> >>> >> >> > should >> >> >>> >> >> > fix that so there is no conflict between implementation >> >> >>> >> >> > internal >> >> >>> >> >> > pinning >> >> >>> >> >> > and application pinning attempts. >> >> >>> >> >> > >> >> >>> >> >> > --------------------------- >> >> >>> >> >> > >> >> >>> >> >> > S16 -open: why does the odp implementation need to know the >> >> >>> >> >> > control >> >> >>> >> >> > and >> >> >>> >> >> > worker mask? If S13 is true, shoudln't these two masks be >> >> >>> >> >> > part >> >> >>> >> >> > of >> >> >>> >> >> > the >> >> >>> >> >> > helper only? (meaning that S11 is wrong) >> >> >>> >> >> > >> >> >>> >> >> > Barry: Currently it probably doesn’t NEED them, but perhaps >> >> >>> >> >> > in >> >> >>> >> >> > the >> >> >>> >> >> > future, with some >> >> >>> >> >> > new API’s and capabilities, it might benefit from this >> >> >>> >> >> > information, >> >> >>> >> >> > and so I would leave them in. >> >> >>> >> >> > >> >> >>> >> >> > Bill: The implementation sees these because they are how a >> >> >>> >> >> > provisioning >> >> >>> >> >> > agent (e.g., OpenDaylight) would pass higher-level >> >> >>> >> >> > configuration >> >> >>> >> >> > information through the application to the underlying ODP >> >> >>> >> >> > implementation. >> >> >>> >> >> > The initial masks specified on odp_init_global() are used >> >> >>> >> >> > in >> >> >>> >> >> > the >> >> >>> >> >> > implementation of the odp_cpumask_default_worker(), >> >> >>> >> >> > odp_cpumask_default_control(), and >> >> >>> >> >> > odp_cpumask_all_available() >> >> >>> >> >> > APIs. >> >> >>> >> >> > >> >> >>> >> >> > --------------------------- >> >> >>> >> >> > >> >> >>> >> >> > S17 -open: should masks passed as parameter to >> >> >>> >> >> > odp_init_global() >> >> >>> >> >> > have >> >> >>> >> >> > the >> >> >>> >> >> > same "namespace" as those used internally within ODP? >> >> >>> >> >> > >> >> >>> >> >> > Barry: YES >> >> >>> >> >> > Bill: Yes. 
>> >>> >> >> > I'm not sure what it would mean for them to be in a
>> >>> >> >> > different namespace. How would those be bridged if they
>> >>> >> >> > weren't?
>> >>> >> >> >
>> >>> >> >> > ---------------------------
>> >>> >> >> >
>> >>> >> >> > S18 -agreed: ODP handles are valid over the whole ODP
>> >>> >> >> > instance, i.e. any odp handle remains valid among all the
>> >>> >> >> > odpthreads of the ODP instance regardless of the odp
>> >>> >> >> > thread type (process, thread or whatever): an ODP thread A
>> >>> >> >> > can pass an odp handle to another ODP thread B (using any
>> >>> >> >> > kind of IPC), and B can use the handle.
>> >>> >> >> >
>> >>> >> >> > Bill: Yes
>> >>> >> >> >
>> >>> >> >> > -----------------
>> >>> >> >> >
>> >>> >> >> > S19 -open: any pointer retrieved by an ODP call (such as
>> >>> >> >> > odp_*_get_addr()) follows the rules defined by the OS,
>> >>> >> >> > with the possible exception defined in S21. For the linux
>> >>> >> >> > generic ODP implementation, this means that pointers are
>> >>> >> >> > fully shareable when using pthreads, and that pointers
>> >>> >> >> > pointing to shared mem areas will be shareable as long as
>> >>> >> >> > the fork() happens after the shm_reserve().
>> >>> >> >> >
>> >>> >> >> > Barry: NO. Disagree. I would prefer to see a consistent
>> >>> >> >> > ODP answer on this topic, and in particular I don't even
>> >>> >> >> > believe that most OSes "have rules defining …", since in
>> >>> >> >> > fact one can make programs run under Linux which can share
>> >>> >> >> > pointers regardless of the ordering of fork() calls. Most
>> >>> >> >> > OSes have lots of (continually evolving) capabilities in
>> >>> >> >> > the category of sharing memory, and so "following the
>> >>> >> >> > rules of the OS" is not well defined.
>> >>> >> >> > Instead, I prefer a simpler rule. Memory reserved using
>> >>> >> >> > the special flag is guaranteed to use the same addresses
>> >>> >> >> > across processes, and all other pointers are not
>> >>> >> >> > guaranteed to be the same nor guaranteed to be different,
>> >>> >> >> > so the ODP programmer should avoid any such assumptions
>> >>> >> >> > for maximum portability. But of course programmers often
>> >>> >> >> > only consider a subset of possible targets (e.g. how many
>> >>> >> >> > programmers consider porting to an 8-bit CPU or a machine
>> >>> >> >> > with a 36-bit word length), and so they may happily take
>> >>> >> >> > advantage of certain non-guaranteed assumptions.
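(For reference, the fork-after-reserve case S19 refers to looks like the
sketch below. This shows odp-linux behaviour only, not a portable
guarantee; error handling omitted:)

    #include <unistd.h>
    #include <odp_api.h>

    static void fork_after_reserve(void)
    {
        odp_shm_t shm = odp_shm_reserve("my_area", 4096, 64, 0);
        void *addr = odp_shm_addr(shm);

        (void)addr;

        if (fork() == 0) {
            /* Child: the mapping was inherited from the parent, so on
             * odp-linux addr happens to be valid here too. Had the
             * reserve been done after the fork, there would be no such
             * expectation. */
        }

        /* Parent: continues to use addr as usual. */
    }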
>> >> >>> >> >> > >> >> >>> >> >> > >> >> >>> >> >> > Bill: As I noted earlier we have to distinguish between >> >> >>> >> >> > different >> >> >>> >> >> > types >> >> >>> >> >> > of >> >> >>> >> >> > memory and where these pointers come from. If the >> >> >>> >> >> > application >> >> >>> >> >> > is >> >> >>> >> >> > using >> >> >>> >> >> > malloc() or some other OS API to get memory and then using >> >> >>> >> >> > that >> >> >>> >> >> > memory's >> >> >>> >> >> > address as, for example, a queue context pointer, then it >> >> >>> >> >> > is >> >> >>> >> >> > taking >> >> >>> >> >> > responsibility for ensuring that these pointers are >> >> >>> >> >> > meaningful >> >> >>> >> >> > to >> >> >>> >> >> > whoever >> >> >>> >> >> > sees them. ODP isn't going to do anything to help there. So >> >> >>> >> >> > this >> >> >>> >> >> > question >> >> >>> >> >> > really only refers to addresses returned from ODP APIs. If >> >> >>> >> >> > we >> >> >>> >> >> > look >> >> >>> >> >> > for >> >> >>> >> >> > void >> >> >>> >> >> > * returns in the ODP API we see that the only addresses ODP >> >> >>> >> >> > returns >> >> >>> >> >> > are: >> >> >>> >> >> > >> >> >>> >> >> > 1) Those that enable addressability to buffers and packets >> >> >>> >> >> > (odp_buffer_addr(), odp_packet_data(), odp_packet_offset(), >> >> >>> >> >> > etc.) >> >> >>> >> >> > >> >> >>> >> >> > These addresses are intended to be used within the scope of >> >> >>> >> >> > the >> >> >>> >> >> > calling >> >> >>> >> >> > thread and should not be assumed to have any validity >> >> >>> >> >> > outside >> >> >>> >> >> > of >> >> >>> >> >> > that >> >> >>> >> >> > context because the buffer/packet is the durable object and >> >> >>> >> >> > any >> >> >>> >> >> > addresses >> >> >>> >> >> > are just (potentially transient) mappings of that object >> >> >>> >> >> > for >> >> >>> >> >> > use >> >> >>> >> >> > by >> >> >>> >> >> > that >> >> >>> >> >> > thread. Packet and buffer handles (not addresses) are >> >> >>> >> >> > passed >> >> >>> >> >> > between >> >> >>> >> >> > threads via queues and the receiver issues its own such >> >> >>> >> >> > calls >> >> >>> >> >> > on >> >> >>> >> >> > the >> >> >>> >> >> > received handles to get its own addressability to these >> >> >>> >> >> > objects. >> >> >>> >> >> > Whether >> >> >>> >> >> > or >> >> >>> >> >> > not these addresses are the same is purely internal to an >> >> >>> >> >> > ODP >> >> >>> >> >> > implementation and is not visible to the application. >> >> >>> >> >> > >> >> >>> >> >> > 2) Packet user areas (odp_packet_user_area()). This API >> >> >>> >> >> > returns >> >> >>> >> >> > the >> >> >>> >> >> > address of a preallocated user area associated with the >> >> >>> >> >> > packet >> >> >>> >> >> > (size >> >> >>> >> >> > defined by the pool that the packet was drawn from at >> >> >>> >> >> > odp_pool_create() >> >> >>> >> >> > time by the max_uarea_size entry in the odp_pool_param_t). >> >> >>> >> >> > Since >> >> >>> >> >> > this >> >> >>> >> >> > is >> >> >>> >> >> > metadata associated with the packet this API may be called >> >> >>> >> >> > by >> >> >>> >> >> > any >> >> >>> >> >> > thread >> >> >>> >> >> > that obtains the odp_packet_t for the packet that contains >> >> >>> >> >> > that >> >> >>> >> >> > user >> >> >>> >> >> > area. >> >> >>> >> >> > However, like the packet itself, the scope of this returned >> >> >>> >> >> > address >> >> >>> >> >> > is >> >> >>> >> >> > the >> >> >>> >> >> > calling thread. 
>> >>> >> >> > So the address returned by odp_packet_user_area() should
>> >>> >> >> > not be cached or passed to any other thread. Each thread
>> >>> >> >> > that needs addressability to this area makes its own call,
>> >>> >> >> > and whether these returned addresses are the same or
>> >>> >> >> > different is again internal to the implementation and not
>> >>> >> >> > visible to the application. Note that just as two threads
>> >>> >> >> > should not own an odp_packet_t at the same time, two
>> >>> >> >> > threads should not be trying to access the user area
>> >>> >> >> > associated with a packet either.
>> >>> >> >> >
>> >>> >> >> > 3) Context pointer getters (odp_queue_context(),
>> >>> >> >> > odp_packet_user_ptr(), odp_timeout_user_ptr(),
>> >>> >> >> > odp_tm_node_context(), odp_tm_queue_context(), and the
>> >>> >> >> > user context pointer carried in the odp_crypto_params_t
>> >>> >> >> > struct)
>> >>> >> >> >
>> >>> >> >> > These are set by the application using corresponding
>> >>> >> >> > setter APIs or provided as values in structs, so the
>> >>> >> >> > application either obtains these pointers on its own, in
>> >>> >> >> > which case it is responsible for ensuring that they are
>> >>> >> >> > meaningful to whoever retrieves them, or from an
>> >>> >> >> > odp_shm_t. So these are not a special case in themselves.
>> >>> >> >> >
>> >>> >> >> > 4) ODP shared memory (odp_shm_addr(), odp_shm_info()).
>> >>> >> >> > These APIs return addresses of odp_shm_t objects that are
>> >>> >> >> > specifically created to support sharing. The rule here is
>> >>> >> >> > simple: the scope of any returned shm address is
>> >>> >> >> > determined by the sharing flag specified at
>> >>> >> >> > odp_shm_reserve() time. ODP currently defines two such
>> >>> >> >> > flags: ODP_SHM_SW_ONLY and ODP_SHM_PROC. We simply need to
>> >>> >> >> > define precisely the intended sharing scope of these two
>> >>> >> >> > (or any new flags we define) to answer this question. Note
>> >>> >> >> > that context pointers drawn from odp_shm_t objects would
>> >>> >> >> > then have whatever sharing attributes the shm object has,
>> >>> >> >> > thus completely defining case (3).
>> >>> >> >> >
>> >>> >> >> > ---------------------
>> >>> >> >> >
>> >>> >> >> > S20 -open: by default, shmem addresses (returned by
>> >>> >> >> > odp_shm_addr()) follow the OS rules, as defined by S19.
>> >>> >> >> >
>> >>> >> >> > Ola: The question is which OS rules apply (an OS can have
>> >>> >> >> > different rules for different OS objects, e.g.
>> >>> >> >> > memory regions allocated using malloc and mmap will
>> >>> >> >> > behave differently). I think the answer depends on how ODP
>> >>> >> >> > shmem objects are implemented. Only the ODP implementation
>> >>> >> >> > knows how ODP shmem objects are created (e.g. using some
>> >>> >> >> > OS system call, or manipulating the page tables directly).
>> >>> >> >> > So essentially the sharability of pointers is ODP
>> >>> >> >> > implementation specific (although ODP implementations on
>> >>> >> >> > the same OS can be expected to behave the same).
>> >>> >> >> > Conclusion: we actually don't specify anything at all
>> >>> >> >> > here; it is completely up to the ODP implementation.
>> >>> >> >> > What is required/expected by ODP applications? If we don't
>> >>> >> >> > make applications happy, ODP is unlikely to succeed.
>> >>> >> >> > I think many applications are happy with a single-process
>> >>> >> >> > thread model where all memory is shared and pointers can
>> >>> >> >> > be shared freely.
>> >>> >> >> > I hear of some applications that require a multi-process
>> >>> >> >> > thread model; I expect that those applications also want
>> >>> >> >> > to be able to share memory and pointers freely between
>> >>> >> >> > them, at least memory that was specifically allocated to
>> >>> >> >> > be shared (so-called shared memory regions; what otherwise
>> >>> >> >> > is the purpose of such memory regions?).
>> >>> >> >> >
>> >>> >> >> > Barry: Disagree, with the same comments as in S19.
>> >>> >> >> >
>> >>> >> >> > Bill: I believe my discourse on S19 completely resolves
>> >>> >> >> > this question. This is controlled by the share flag
>> >>> >> >> > specified at odp_shm_reserve() time. We just need to
>> >>> >> >> > specify the sharing scope implied by each of these, and
>> >>> >> >> > then it is up to each implementation to see that such
>> >>> >> >> > scope is realized.
>> >>> >> >> >
>> >>> >> >> > ---------------------
>> >>> >> >> >
>> >>> >> >> > S21 -open: shm will support an extra flag at
>> >>> >> >> > shm_reserve() call time: SHM_XXX. The usage of this flag
>> >>> >> >> > will allocate shared memory guaranteed to be located at
>> >>> >> >> > the same virtual address in all odpthreads of the
>> >>> >> >> > odp_instance. Pointers to this shared memory type are
>> >>> >> >> > therefore fully sharable, even for odpthreads running in
>> >>> >> >> > different VA spaces (e.g. processes).
>> >>> >> >> > The amount of memory which can be allocated using this
>> >>> >> >> > flag can be limited to any value by the ODP
>> >>> >> >> > implementation, down to zero bytes, meaning that some odp
>> >>> >> >> > implementations may not support this option at all.
>> >>> >> >> > shm_reserve() will return an error in this case. The usage
>> >>> >> >> > of this flag by the application is therefore not
>> >>> >> >> > recommended. The ODP implementation may require a hint
>> >>> >> >> > about the size of this area at odp_init_global() call
>> >>> >> >> > time.
>> >>> >> >> >
>> >>> >> >> > Barry: Mostly agree, except for the comment about the
>> >>> >> >> > special flag not being recommended.
>> >>> >> >> >
>> >>> >> >> > Ola: Agree. Some/many applications will want to share
>> >>> >> >> > memory between threads/processes and must be able to do
>> >>> >> >> > so. Some ODP platforms may have limitations on the amount
>> >>> >> >> > of memory (if any) that can be shared and may thus fail to
>> >>> >> >> > run certain applications. Such is life. I don't see a
>> >>> >> >> > problem with that. Possibly we should remove the phrase
>> >>> >> >> > "not recommended" and just state that portability may be
>> >>> >> >> > limited.
>> >>> >> >> >
>> >>> >> >> > Bill: Yes. As noted in S19 and S20, the intent of the
>> >>> >> >> > share flag is to specify the desired addressability scope
>> >>> >> >> > for the returned odp_shm_t. It's perfectly reasonable to
>> >>> >> >> > define multiple such scopes that may have different
>> >>> >> >> > intended uses (and implementation costs).
>> >>> >> >> >
>> >>> >> >> > ------------------
>> >>> >> >> >
>> >>> >> >> > S22 -open: please put here your name suggestions for this
>> >>> >> >> > SHM_XXX flag :-).
>> >>> >> >> >
>> >>> >> >> > Ola:
>> >>> >> >> > SHM_I_REALLY_WANT_TO_SHARE_THIS_MEMORY
>> >>> >> >> >
>> >>> >> >> > Bill: I previously suggested ODP_SHM_INSTANCE, which
>> >>> >> >> > specifies that the sharing scope of this odp_shm_t is the
>> >>> >> >> > entire ODP instance.
>> >>> >> >> >
>> >>> >> >> > ------------------
>> >>> >> >> >
>> >>> >> >> > S23 -open: The rules above define relatively well the
>> >>> >> >> > behaviour of pointers retrieved by the call to
>> >>> >> >> > odp_shm_addr(). But many points need to be defined
>> >>> >> >> > regarding other ODP object pointers: what is the validity
>> >>> >> >> > of a pointer to a packet, for instance?
>> >>> >> >> > If process A creates a packet pool P, then forks B and C,
>> >>> >> >> > and B allocates a packet from P and retrieves a pointer to
>> >>> >> >> > a packet allocated from this P... is this pointer valid in
>> >>> >> >> > A and C? In the current l-g implementation, it will be...
>> >>> >> >> > Is this behaviour something we wish to enforce on every
>> >>> >> >> > odp implementation? What about other objects: buffers,
>> >>> >> >> > atomics... Some clear rule has to be defined here: how
>> >>> >> >> > things behave, and whether this behaviour is a part of the
>> >>> >> >> > ODP API or just specific to different implementations...
>> >>> >> >> >
>> >>> >> >> > Ola: Perhaps we need the option to specify the
>> >>> >> >> > I_REALLY_WANT_TO_SHARE_THIS_MEMORY flag when creating all
>> >>> >> >> > types of ODP pools?
>> >>> >> >> > An ODP implementation can always fail to create such a
>> >>> >> >> > pool if the sharability requirement cannot be satisfied.
>> >>> >> >> > Allocation of locations used for atomic operations is the
>> >>> >> >> > responsibility of the application, which can (and must)
>> >>> >> >> > choose a suitable type of memory.
>> >>> >> >> > It is better that sharability is an explicit requirement
>> >>> >> >> > from the application. It should be specified as a flag
>> >>> >> >> > parameter to the different calls that create/allocate
>> >>> >> >> > regions of memory (shmem, different types of pools).
>> >>> >> >> >
>> >>> >> >> > Barry:
>> >>> >> >> > Again, refer to the S19 answer. Specifically, it is about
>> >>> >> >> > what is GUARANTEED regarding pointer validity, not whether
>> >>> >> >> > the pointers in certain cases will happen to be the same.
>> >>> >> >> > So for your example, the pointer is not guaranteed to be
>> >>> >> >> > valid in A and C, but the programmer might well believe
>> >>> >> >> > that for all the ODP platforms and implementations they
>> >>> >> >> > expect to run on, this is very likely to be the case, in
>> >>> >> >> > which case we can't stop them from constraining their
>> >>> >> >> > program's portability – no more than requiring them to be
>> >>> >> >> > able to port to a ternary (3-valued "bit") architecture.
>> >>> >> >>
>> >>> >> >> Bill: See the s19 response, which answers all these
>> >>> >> >> questions. The fork case you mention is moot because
>> >>> >> >> addresses returned by ODP packet operations are only valid
>> >>> >> >> in the thread that makes those calls.
>> >>> >> >> The handle must be valid throughout the instance, so that
>> >>> >> >> may affect how the implementation chooses to implement
>> >>> >> >> packet pool creation, but such questions are internal to
>> >>> >> >> each implementation and not part of the API.
>> >>> >> >>
>> >>> >> >> Atomics are application-declared entities, and so memory
>> >>> >> >> sharing is again covered by my response to s19. If the
>> >>> >> >> atomic is in storage the application allocated from the OS,
>> >>> >> >> then its sharability is the responsibility of the
>> >>> >> >> application. If the atomic is part of a struct that resides
>> >>> >> >> in an odp_shm_t, then its sharability is governed by the
>> >>> >> >> share flag of the odp_shm it is taken from.
>> >>> >> >>
>> >>> >> >> No additional parameters are required for pools because
>> >>> >> >> what is shared from them is handles, not addresses. As
>> >>> >> >> noted, when these handles are converted to addresses, those
>> >>> >> >> addresses are only valid for the calling thread. All other
>> >>> >> >> threads that have access to the handle make their own calls,
>> >>> >> >> and whether or not those two addresses have any relationship
>> >>> >> >> to each other is not visible to the application.
>> >>> >> >>
>> >>> >> >> >
>> >>> >> >> > ---------------------
>> >>> >> >> >
>> >>> >> >> > Thanks for your feedback!
>> >>> >> >> >
>> >>> >> >> _______________________________________________
>> >>> >> >> lng-odp mailing list
>> >>> >> >> lng-odp@lists.linaro.org
>> >>> >> >> https://lists.linaro.org/mailman/listinfo/lng-odp

_______________________________________________
lng-odp mailing list
lng-odp@lists.linaro.org
https://lists.linaro.org/mailman/listinfo/lng-odp