Thanks Jerin, I have merged your comments into the doc. There were many points you did not comment on (I was not sure whether a "yes" below a group applied to the whole group of statements). You can review the doc to make sure I did not corrupt your feelings :-)
Once again, thanks.
Christophe

On 8 June 2016 at 14:11, Jerin Jacob <jerin.ja...@caviumnetworks.com> wrote:
> On Fri, Jun 03, 2016 at 11:15:43AM +0200, Christophe Milard wrote:
>> since V3: Update following Bill's comments
>> since V2: Update following Barry and Bill's comments
>> since V1: Update following arch call 31 may 2016
>>
>> This is an attempt to sum up the discussions around the thread/process
>> that have been happening these last weeks.
>> Sorry for the formalism of this mail, but it seems we need accuracy
>> here...
>>
>> This summary is organized as follows:
>>
>> It is a set of statements, each of them expecting a separate answer
>> from you. When no specific ODP version is specified, the statement
>> regards the "ultimate" goal (i.e. what we want eventually to achieve).
>> Each statement is prefixed with:
>> - a statement number for further reference (e.g. S1)
>> - a status word (one of 'agreed' or 'open', or 'closed').
>> Agreed statements expect a yes/no answer: 'yes' meaning that you
>> acknowledge that this is your understanding of the agreement and will
>> not nack an implementation based on this statement. You can comment
>> after a yes, but your comment will not block any implementation based
>> on the agreed statement. A 'no' implies that the statement does not
>> reflect your understanding of the agreement, or you refuse the
>> proposal.
>> Any 'no' received on an 'agreed' statement will push it back as 'open'.
>> Open statements are fully open for further discussion.
>>
>> S1 -agreed: an ODP thread is an OS/platform concurrent execution
>> environment object (as opposed to an ODP object). No more specific
>> definition is given by the ODP API itself.
>>
>> Barry: YES
>> Bill: Yes
>
> Jerin: Yes
>
>>
>> ---------------------------
>>
>> S2 -agreed: Each ODP implementation must tell what is allowed to be
>> used as ODP thread for that specific implementation: a linux-based
>> implementation, for instance, will have to state whether odp threads
>> can be linux pthread, linux processes, or both, or any other type of
>> concurrent execution environment. ODP implementations can put any
>> restriction they wish on what an ODP thread is allowed to be. This
>> should be documented in the ODP implementation documentation.
>>
>> Barry: YES
>> Bill: Yes
>
> Jerin: Yes. Apart from Linux process or thread based implementations, an
> implementation may choose to have a bare-metal execution environment
> (i.e. without any OS).
>
>>
>> ---------------------------
>>
>> S3 -agreed: in the linux generic ODP implementation an odpthread will be
>> either:
>> * a linux process descendant of (or same as) the odp instantiation
>> process.
>> * a pthread 'member' of a linux process descendant of (or same
>> as) the odp instantiation process.
>>
>> Barry: YES
>> Bill: Yes
>>
>> ---------------------------
>>
>> S4 -agreed: For monarch, the linux generic ODP implementation only
>> supports odp thread as pthread member of the instantiation process.
>>
>> Barry: YES
>> Bill: Yes
>>
>> ---------------------------
>>
>> S5 -agreed: whether multiple instances of ODP can be run on the same
>> machine is left as an implementation decision. The ODP implementation
>> document should state what is supported and any restriction is allowed.
>>
>> Barry: YES
>> Bill: Yes
>
> Jerin: Yes
>
>>
>> ---------------------------
>>
>> S6 -agreed: The l-g odp implementation will support multiple odp
>> instances whose instantiation processes are different and not
>> ancestor/descendant of each other.
>> Different instances of ODP will,
>> of course, be restricted in sharing common OS resources (The total
>> amount of memory available for each ODP instance may decrease as the
>> number of instances increases, the access to network interfaces will
>> probably be granted to the first instance grabbing the interface and
>> denied to others... some other rule may apply when sharing other
>> common ODP resources.)
>>
>> Bill: Yes
>>
>> ---------------------------
>>
>> S7 -agreed: the l-g odp implementation will not support multiple ODP
>> instances initiated from the same linux process (calling
>> odp_init_global() multiple times).
>> As an illustration, this means that a single process P is not allowed
>> to execute the following calls (in any order)
>>    instance1 = odp_init_global()
>>    instance2 = odp_init_global()
>>    pthread_create (and, in that thread, run odp_init_local(instance1) )
>>    pthread_create (and, in that thread, run odp_init_local(instance2) )
>>
>> Bill: Yes
>>
>> -------------------
>>
>> S8 -agreed: the l-g odp implementation will not support multiple ODP
>> instances initiated from related linux processes (descendant/ancestor
>> of each other), which would otherwise enable ODP 'sub-instances'.
>> As an illustration,
>> this means that the following is not supported:
>>    instance1 = odp_init_global()
>>    pthread_create (and, in that thread, run odp_init_local(instance1) )
>>    if (fork()==0) {
>>        instance2 = odp_init_global()
>>        pthread_create (and, in that thread, run odp_init_local(instance2) )
>>    }
>>
>> Bill: Yes
>>
>> --------------------
>>
>> S9 -agreed: the odp instance passed as parameter to odp_init_local()
>> must always be one of the odp_instance returned by odp_init_global()
>>
>> Barry: YES
>> Bill: Yes
>>
>> ---------------------------
>>
>> S10 -agreed: For l-g, if the answers to S7 and S8 are 'yes', then due to S3,
>> the odp_instance an odp_thread can attach to is completely defined by
>> the ancestor of the thread, making the odp_instance parameter of
>> odp_init_local() redundant. The odp l-g implementation guide will
>> highlight this redundancy, but will stress that even in this case the
>> parameter to odp_init_local() still has to be set correctly, as its
>> usage is internal to the implementation.
>>
>> Barry: I think so
>> Bill: This practice also ensures that applications behave unchanged if
>> and when multi-instance support is added, so I don't think we need to
>> be apologetic about this parameter requirement.
>>
>> ---------------------------
>>
>> S11 -agreed: at odp_init_global() time, the application will provide 3
>> sets of cpu (i.e. 3 cpu masks):
>> -the control cpu mask
>> -the worker cpu mask
>> -the odp service cpu mask (i.e. the set of cpu odp can take for
>> its own usage)
>> Note: The service CPU mask will be introduced post monarch
>>
>> Bill: Yes
>> Barry: YES
>> ---------------------------
>>
>> S12 -agreed: the odp implementation may return an error at
>> odp_init_global() call if the number of cpu in the odp service mask
>> (or their 'position') does not match the ODP implementation need.
>>
>> Barry: YES
>> Bill: Yes.
>> However, an implementation may fail an odp_init_global() call
>> for any resource insufficiency, not just cpus.
>
> Jerin: Yes
>
>>
>>
>> ---------------------------
>>
>> S13 -agreed: the application is fully responsible for pinning its own
>> odp threads to different cpus, and this is done directly through OS
>> system calls, or via helper functions (as opposed to ODP API calls).
>> This pinning should be made among cpus that are members of the worker
>> cpu mask or the control cpu mask.
>>
>> Barry: YES, but I support the existence of helper functions to do this
>> – including the important case of pinning the main thread
>>
>> Bill: Yes. And agree an ODP helper is useful here (which is why odp-linux
>> provides one).
>
> Jerin: YES, a helper function makes much sense in a bare metal environment.
>
>>
>> ---------------------------
>>
>> S14 -agreed: whether more than one odp thread can be pinned to the
>> same cpu is left as an implementation choice (and the answer to that
>> question can be different for the service, worker and control
>> threads). This choice should be well documented in the implementation
>> user manual.
>>
>> Barry: YES
>> Bill: Yes
>
> Jerin: Yes
>
>>
>> ---------------------------
>>
>> S15 -agreed: the odp implementation is responsible for pinning its own
>> service threads among the cpus that are members of the odp service cpu
>> mask.
>>
>> Barry: YES, in principle – BUT be aware that currently the l-g ODP
>> implementation (and perhaps many others) cannot call the helper
>> functions (unless inlined), so this internal pinning may not be well
>> coordinated with the helpers.
>>
>> Bill: Yes. And I agree with Barry on the helper recursion issue. We should
>> fix that so there is no conflict between implementation internal pinning
>> and application pinning attempts.
>
> Jerin: The way other data plane models achieve this is by sharing the
> service as an API. It is up to the application to decide
> on which core it needs to run.
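
As a side note on S13, pinning "directly through OS system calls" could look roughly like the sketch below on Linux. This is only an illustration under my own assumptions: the function name pin_self_to_first_allowed_cpu() is hypothetical (it is not an ODP API nor the odp-linux helper), and it simply picks the lowest-numbered cpu of the caller's current affinity mask, whereas a real application would pick a cpu from the worker or control mask.

```c
#define _GNU_SOURCE
#include <sched.h>

/* Hypothetical illustration, NOT an ODP API or ODP helper function.
 * Pins the calling thread to the lowest-numbered cpu found in its
 * current affinity mask.
 * Returns the chosen cpu number on success, -1 on failure. */
int pin_self_to_first_allowed_cpu(void)
{
    cpu_set_t allowed;
    cpu_set_t target;
    int cpu;

    /* A pid of 0 designates the calling thread. */
    if (sched_getaffinity(0, sizeof(allowed), &allowed) != 0)
        return -1;

    /* Find the first cpu we are currently allowed to run on. */
    for (cpu = 0; cpu < CPU_SETSIZE; cpu++) {
        if (CPU_ISSET(cpu, &allowed))
            break;
    }
    if (cpu == CPU_SETSIZE)
        return -1;

    /* Restrict the calling thread to that single cpu. */
    CPU_ZERO(&target);
    CPU_SET(cpu, &target);
    if (sched_setaffinity(0, sizeof(target), &target) != 0)
        return -1;

    return cpu;
}
```

Each odp thread would call something like this on itself right after being created; nothing here involves the ODP implementation, which is the point of S13.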
>
>>
>> ---------------------------
>>
>> S16 -open: why does the odp implementation need to know the control and
>> worker mask? If S13 is true, shouldn't these two masks be part of the
>> helper only? (meaning that S11 is wrong)
>>
>> Barry: Currently it probably doesn’t NEED them, but perhaps in the
>> future, with some new API’s and capabilities, it might benefit from
>> this information, and so I would leave them in.
>>
>> Bill: The implementation sees these because they are how a provisioning
>> agent (e.g., OpenDaylight) would pass higher-level configuration
>> information through the application to the underlying ODP implementation.
>> The initial masks specified on odp_init_global() are used in the
>> implementation of the odp_cpumask_default_worker(),
>> odp_cpumask_default_control(), and odp_cpumask_all_available() APIs.
>
> Jerin: IMO, a control thread may not be 1:1 mapped to a core.
> And on certain HW architectures, HW resources are bound to particular
> cores. So given that kind of architecture, it makes sense for control
> and worker threads to be defined by the implementation.
>
>>
>> ---------------------------
>>
>> S17 -open: should masks passed as parameter to odp_init_global() have the
>> same "namespace" as those used internally within ODP?
>>
>> Barry: YES
>> Bill: Yes. I'm not sure what it would mean for them to be in a different
>> namespace. How would those be bridged if they weren't?
>>
>>
>> ---------------------------
>>
>> S18 -agreed: ODP handles are valid over the whole ODP instance, i.e.
>> any odp handle remains valid among all the odpthreads of the ODP
>> instance regardless of the odp thread type (process, thread or
>> whatever): an ODP thread A can pass an odp handle to another ODP
>> thread B (using any kind of IPC), and B can use the handle.
>>
>> Bill: Yes
>
> Jerin: Yes, but in a multiple process implementation it may be
> expected to get the handles with odp lookup functions.
>
>>
>> -----------------
>>
>> S19 -open : any pointer retrieved by an ODP call (such as
>> odp_*_get_addr()) follows the rules defined by the OS, with the
>> possible exception defined in S21. For the linux generic ODP
>> implementation, this means that
>> pointers are fully shareable when using pthreads and that pointers
>> pointing to shared mem areas will be shareable as long as the fork()
>> happens after the shm_reserve().
>>
>> Barry: NO. Disagree. I would prefer to see a consistent ODP answer on
>> this topic, and in particular I don’t even believe that most OS’s
>> “have rules defining …”, since in fact one can make programs run under
>> Linux which can share pointers regardless of the ordering of fork()
>> calls. Most OS have lots of (continually evolving) capabilities in the
>> category of sharing memory and so “following the rules of the OS” is
>> not well defined.
>> Instead, I prefer a simpler rule. Memory reserved using the special
>> flag is guaranteed to use the same addresses across processes, and all
>> other pointers are not guaranteed to be the same nor guaranteed to be
>> different, so the ODP programmer should avoid any such assumptions for
>> maximum portability. But of course programmers often only consider a
>> subset of possible targets (e.g. how many programmers consider porting
>> to an 8-bit CPU or a machine with a 36-bit word length), and so they
>> may happily take advantage of certain non-guaranteed assumptions.
>>
>>
>> Bill: As I noted earlier we have to distinguish between different types of
>> memory and where these pointers come from. If the application is using
>> malloc() or some other OS API to get memory and then using that memory's
>> address as, for example, a queue context pointer, then it is taking
>> responsibility for ensuring that these pointers are meaningful to whoever
>> sees them. ODP isn't going to do anything to help there.
>> So this question
>> really only refers to addresses returned from ODP APIs. If we look for void
>> * returns in the ODP API we see that the only addresses ODP returns are:
>>
>> 1) Those that enable addressability to buffers and packets
>> (odp_buffer_addr(), odp_packet_data(), odp_packet_offset(), etc.)
>>
>> These addresses are intended to be used within the scope of the calling
>> thread and should not be assumed to have any validity outside of that
>> context because the buffer/packet is the durable object and any addresses
>> are just (potentially transient) mappings of that object for use by that
>> thread. Packet and buffer handles (not addresses) are passed between
>> threads via queues and the receiver issues its own such calls on the
>> received handles to get its own addressability to these objects. Whether or
>> not these addresses are the same is purely internal to an ODP
>> implementation and is not visible to the application.
>>
>> 2) Packet user areas (odp_packet_user_area()). This API returns the
>> address of a preallocated user area associated with the packet (size
>> defined by the pool that the packet was drawn from at odp_pool_create()
>> time by the max_uarea_size entry in the odp_pool_param_t). Since this is
>> metadata associated with the packet this API may be called by any thread
>> that obtains the odp_packet_t for the packet that contains that user area.
>> However, like the packet itself, the scope of this returned address is the
>> calling thread. So the address returned by odp_packet_user_area() should
>> not be cached or passed to any other thread. Each thread that needs
>> addressability to this area makes its own call and whether these returned
>> addresses are the same or different is again internal to the implementation
>> and not visible to the application.
>> Note that just as two threads should
>> not have ownership of an odp_packet_t at the same time, two threads should
>> not be trying to access the user area associated with a packet either.
>>
>> 3) Context pointer getters (odp_queue_context(), odp_packet_user_ptr(),
>> odp_timeout_user_ptr(), odp_tm_node_context(), odp_tm_queue_context(), and
>> the user context pointer carried in the odp_crypto_params_t struct)
>>
>> These are set by the application using corresponding setter APIs or
>> provided as values in structs, so the application either obtains these
>> pointers on its own, in which case it is responsible for ensuring that they
>> are meaningful to whoever retrieves them, or from an odp_shm_t. So these
>> are not a special case in themselves.
>>
>> 4) ODP shared memory (odp_shm_addr(), odp_shm_info()). These APIs return
>> addresses to odp_shm_t objects that are specifically created to support
>> sharing. The rule here is simple: the scope of any returned shm address is
>> determined by the sharing flag specified at odp_shm_reserve() time. ODP
>> currently defines two such flags: ODP_SHM_SW_ONLY and ODP_SHM_PROC. We
>> simply need to define precisely the intended sharing scope of these two (or
>> any new flags we define) to answer this question. Note that context
>> pointers drawn from odp_shm_t objects would then have whatever sharing
>> attributes that the shm object has, thus completely defining case (3).
>>
>
> Jerin: Agree with Bill. Apart from the four different types of address
> mentioned above, odp handles may also be pointers. So it makes
> sense to resolve handles across the odp instance with odp handle
> lookup APIs.
>
>> ---------------------
>>
>> S20 -open: by default, shmem addresses (returned by odp_shm_addr())
>> follow the OS rules, as defined by S19.
>>
>> Ola: The question is which OS rules apply (an OS can have different rules for
>> different OS objects, e.g. memory regions allocated using malloc and mmap
>> will behave differently).
>> I think the answer depends on how ODP shmem objects
>> are implemented. Only the ODP implementation knows how ODP shmem objects
>> are created (e.g. use some OS system call, manipulate the page tables
>> directly). So essentially the sharability of pointers is ODP implementation
>> specific (although ODP implementations on the same OS can be expected to
>> behave the same). Conclusion: we actually don't specify anything at all
>> here, it is completely up to the ODP implementation.
>> What is required/expected by ODP applications? If we don't make
>> applications happy, ODP is unlikely to succeed.
>> I think many applications are happy with a single-process thread model
>> where all memory is shared and pointers can be shared freely.
>> I hear of some applications that require a multi-process thread model. I
>> expect that those applications also want to be able to share memory and
>> pointers freely between them, at least memory that was specifically
>> allocated to be shared (so called shared memory regions, what's otherwise
>> the purpose of such memory regions?).
>>
>> Barry: Disagree with the same comments as in S19.
>>
>> Bill: I believe my discourse on S19 completely resolves this question. This
>> is controlled by the share flag specified at odp_shm_reserve() time. We
>> just need to specify the sharing scope implied by each of these and then it
>> is up to each implementation to see that such scope is realized.
>>
>> ---------------------
>>
>> S21 -open: shm will support an extra flag at shm_reserve() call time:
>> SHM_XXX. The usage of this flag will allocate shared memory guaranteed
>> to be located at the same virtual address on all odpthreads of the
>> odp_instance. Pointers to this shared memory type are therefore fully
>> sharable, even on odpthreads running on different VA space (e.g.
>> processes).
>> The amount of memory which can be allocated using this
>> flag can be
>> limited to any value by the ODP implementation, down to zero bytes,
>> meaning that some odp implementation may not support this option at
>> all. The shm_reserve() will return an error in this case. The usage of
>> this flag by the application is therefore not recommended. The ODP
>> implementation may require a hint about the size of this area at
>> odp_init_global() call time.
>>
>> Barry: Mostly agree, except for the comment about the special flag not
>> being recommended.
>>
>> Ola: Agree. Some/many applications will want to share memory between
>> threads/processes and must be able to do so. Some ODP platforms may have
>> limitations to the amount of memory (if any) that can be shared and may
>> thus fail to run certain applications. Such is life. I don't see a problem
>> with that. Possibly we should remove the phrase "not recommended" and just
>> state that portability may be limited.
>>
>>
>> Bill: Yes. As noted in S19 and S20 the intent of the share flag is to
>> specify desired addressability scope for the returned odp_shm_t. It's
>> perfectly reasonable to define multiple such scopes that may have different
>> intended uses (and implementation costs).
>>
>> ------------------
>>
>> S22 -open: please put here your name suggestions for this SHM_XXX flag
>> :-).
>>
>> Ola:
>> SHM_I_REALLY_WANT_TO_SHARE_THIS_MEMORY
>>
>> Bill: I previously suggested ODP_SHM_INSTANCE that specifies that the
>> sharing scope of this odp_shm_t is the entire ODP instance.
>>
>> ------------------
>>
>> S23 -open: The rules above define relatively well the behaviour of
>> pointers retrieved by the call to odp_shm_addr(). But many points
>> need to be defined regarding other ODP objects pointers: What is the
>> validity of a pointer to a packet, for instance?
>> If process A creates
>> a packet pool P, then forks B and C, and B allocates a packet from P
>> and retrieves a pointer to a packet allocated from this P... Is this
>> pointer valid in A and C? In the current l-g implementation, it
>> will... Is this behaviour
>> something we wish to enforce on any odp implementation? What about
>> other objects: buffers, atomics ... Some clear rule has to be defined
>> here... How things behave and if this behaviour is a part of the ODP
>> API or just specific to different implementations...
>>
>> Ola: Perhaps we need the option to specify the
>> I_REALLY_WANT_TO_SHARE_THIS_MEMORY flag when creating all types of ODP
>> pools?
>> An ODP implementation can always fail to create such a pool if the
>> sharability requirement can not be satisfied.
>> Allocation of locations used for atomic operations is the responsibility
>> of the application which can (and must) choose a suitable type of memory.
>> It is better that sharability is an explicit requirement from the
>> application. It should be specified as a flag parameter to the different
>> calls that create/allocate regions of memory (shmem, different types of
>> pools).
>>
>>
>> Barry:
>> Again refer to S19 answer. Specifically it is about what is
>> GUARANTEED regarding pointer validity, not whether the pointers in
>> certain cases will happen to be the same. So for your example, the
>> pointer is not guaranteed to be valid in A and C, but the programmer
>> might well believe that for all the ODP platforms and implementations
>> they expect to run on, this is very likely to be the case, in which
>> case we can’t stop them from constraining their program’s portability
>> – no more than requiring them to be able to port to a ternary
>> (3-valued “bit”) architecture.
>>
>>
>>
>> ---------------------
>>
>> Thanks for your feedback!
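
To make the "fork() after shm_reserve()" point of S19 and the A/B/C question of S23 a bit more concrete, here is a minimal sketch in plain Linux/POSIX C. It makes no ODP calls at all, so it says nothing about what any ODP implementation actually guarantees: the function name parent_view_after_child_write() is made up for this mail, and mmap() merely stands in for whatever a shm reserve might do internally. It only shows the underlying OS behaviour being discussed: a mapping created before fork() with MAP_SHARED is visible through the same pointer in parent and child, while a MAP_PRIVATE one is not.

```c
#define _DEFAULT_SOURCE
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical demo, plain Linux/POSIX, no ODP involved.
 * Maps one int BEFORE fork(), lets the child write 'value' through the
 * inherited pointer, and returns what the parent then reads through the
 * very same pointer. Returns -1 on any error.
 * map_flags is MAP_SHARED or MAP_PRIVATE. */
int parent_view_after_child_write(int map_flags, int value)
{
    int status;
    int seen;
    pid_t pid;
    int *p = mmap(NULL, sizeof(*p), PROT_READ | PROT_WRITE,
                  map_flags | MAP_ANONYMOUS, -1, 0);

    if (p == MAP_FAILED)
        return -1;
    *p = 0;

    pid = fork();
    if (pid < 0) {
        munmap(p, sizeof(*p));
        return -1;
    }
    if (pid == 0) {
        /* Child: same virtual address as in the parent. */
        *p = value;
        _exit(0);
    }

    /* Parent: wait for the child, then read through the same pointer. */
    if (waitpid(pid, &status, 0) < 0) {
        munmap(p, sizeof(*p));
        return -1;
    }
    seen = *p;
    munmap(p, sizeof(*p));
    return seen;
}
```

With MAP_SHARED the parent reads back the child's value; with MAP_PRIVATE it still reads 0, because the child wrote to its own copy-on-write page. Memory mapped after the fork() is a different story again, which is where a flag with a guaranteed sharing scope (S21) comes in.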
>>
>>
>>
>> _______________________________________________
>> lng-odp mailing list
>> lng-odp@lists.linaro.org
>> https://lists.linaro.org/mailman/listinfo/lng-odp