I got the impression that the ODP MBUS API would define a transport
protocol/API between an ODP application and a control plane application,
just as TCP is the transport protocol for HTTP applications (e.g. the
Web). Netlink defines exactly that: a transport protocol for
configuration messages. Maxim asked about the messages - should
applications define the message format and/or the message content?
Wouldn't it be an easier task for the application to define only the
content and let ODP define the format? Reliability could be an issue,
but the Netlink spec describes how applications can create reliable
protocols:

    One could create a reliable protocol between an FEC and a CPC by
    using the combination of sequence numbers, ACKs, and retransmit
    timers. Both sequence numbers and ACKs are provided by Netlink;
    timers are provided by Linux.

    One could create a heartbeat protocol between the FEC and CPC by
    using the ECHO flags and the NLMSG_NOOP message.
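(To make the RFC 3549 recipe concrete, here is a minimal sketch in C of
the seq/ACK plumbing, assuming a socket opened with
socket(AF_NETLINK, SOCK_RAW, NETLINK_ROUTE). Whether a given kernel
acknowledges an NLMSG_NOOP is not something the spec promises, so treat
this as an illustration of the mechanism, not a tested heartbeat.)

    #include <linux/netlink.h>
    #include <stdint.h>
    #include <string.h>
    #include <sys/socket.h>

    /* Send a no-op with NLM_F_ACK set and wait for the kernel's
     * NLMSG_ERROR reply carrying our sequence number; an error value
     * of 0 is a positive ACK. Wrapping this in a retransmit timer
     * yields the reliable protocol described above. */
    static int nl_acked_noop(int fd, uint32_t seq)
    {
        struct nlmsghdr req;

        memset(&req, 0, sizeof(req));
        req.nlmsg_len = NLMSG_LENGTH(0);     /* header only, no payload */
        req.nlmsg_type = NLMSG_NOOP;         /* heartbeat-style no-op */
        req.nlmsg_flags = NLM_F_REQUEST | NLM_F_ACK;
        req.nlmsg_seq = seq;                 /* sequence number we track */

        if (send(fd, &req, req.nlmsg_len, 0) < 0)
            return -1;

        char buf[4096];
        ssize_t n = recv(fd, buf, sizeof(buf), 0); /* use a timeout in real code */
        struct nlmsghdr *nh;

        for (nh = (struct nlmsghdr *)buf; n > 0 && NLMSG_OK(nh, n);
             nh = NLMSG_NEXT(nh, n)) {
            if (nh->nlmsg_type == NLMSG_ERROR && nh->nlmsg_seq == seq)
                return ((struct nlmsgerr *)NLMSG_DATA(nh))->error; /* 0 == ACK */
        }
        return -1;                           /* no ACK seen: retransmit */
    }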
On 21 May 2015 at 16:23, Ola Liljedahl <ola.liljed...@linaro.org> wrote:
> On 21 May 2015 at 15:05, Alexandru Badicioiu <
> alexandru.badici...@linaro.org> wrote:
>
>> I was referring to the Netlink protocol in itself, as a model for ODP
>> MBUS (or IPC).
>>
> Isn't the Netlink protocol what the endpoints send between them? This
> is not specified by the ODP IPC/MBUS API; applications can define or
> re-use whatever protocol they like. The protocol definition is heavily
> dependent on what you actually use the IPC for, and we shouldn't force
> ODP users to use some specific predefined protocol.
>
> Also, the "wire protocol" is left undefined; this is up to the
> implementation to define, and each platform can have its own
> definition.
>
> And Netlink isn't even reliable. I know that this creates problems,
> e.g. it is impossible to get a clean and complete snapshot of e.g. the
> routing table.
>
>> The interaction between the FEC and the CPC, in the Netlink context,
>> defines a protocol. Netlink provides mechanisms for the CPC
>> (residing in user space) and the FEC (residing in kernel space) to
>> have their own protocol definition -- *kernel space and user space
>> just mean different protection domains*. Therefore, a wire protocol
>> is needed to communicate. The wire protocol is normally provided by
>> some privileged service that is able to copy between multiple
>> protection domains. We will refer to this service as the Netlink
>> service. The Netlink service can also be encapsulated in a different
>> transport layer, if the CPC executes on a different node than the
>> FEC. The FEC and CPC, using Netlink mechanisms, may choose to define
>> a reliable protocol between each other. By default, however, Netlink
>> provides an unreliable communication.
>>
>> Note that the FEC and CPC can both live in the same memory protection
>> domain and use the connect() system call to create a path to the peer
>> and talk to each other. We will not discuss this mechanism further
>> other than to say that it is available. Throughout this document, we
>> will refer interchangeably to the FEC to mean kernel space and the
>> CPC to mean user space. This denomination is not meant, however, to
>> restrict the two components to these protection domains or to the
>> same compute node.
>>
>> On 21 May 2015 at 15:55, Ola Liljedahl <ola.liljed...@linaro.org> wrote:
>>
>>> On 21 May 2015 at 13:22, Alexandru Badicioiu <
>>> alexandru.badici...@linaro.org> wrote:
>>> > Hi,
>>> > would the Netlink protocol (https://tools.ietf.org/html/rfc3549)
>>> > fit the purpose of ODP IPC (within a single OS instance)?
>>> I interpret this as a question of whether Netlink would be suitable
>>> as an implementation of the ODP IPC (now called message bus because
>>> "IPC" is so contended and imbued with different meanings).
>>>
>>> It is perhaps possible. Netlink seems a bit focused on intra-kernel
>>> and kernel-to-user, while the ODP IPC/MBUS is focused on user-to-user
>>> (application-to-application).
>>>
>>> I see a couple of primary requirements:
>>>
>>> - Support communication (message exchange) between user space
>>>   processes.
>>> - Support arbitrary user-defined messages.
>>> - Ordered, reliable delivery of messages.
>>>
>>> From the little I can quickly read up on Netlink, the first two
>>> requirements do not seem supported. But perhaps someone with more
>>> intimate knowledge of Netlink can prove me wrong. Or maybe Netlink
>>> can be extended to support u2u and user-defined messages, but the
>>> current specialization (e.g. specialized addressing, specialized
>>> message formats) seems contrary to the goal of providing generic
>>> mechanisms in the kernel that can be used for different things.
>>>
>>> My IPC/MBUS reference implementation for linux-generic builds upon
>>> POSIX message queues. One of my issues is that I want the message
>>> queue associated with a process to go away when the process goes
>>> away; the message queues should not be independent entities.
>>>
>>> -- Ola
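(The lifetime problem is visible directly in the POSIX API: message
queues are kernel-persistent and survive the creating process unless
explicitly unlinked. A minimal sketch - queue name and attributes are
invented for illustration; link with -lrt:)

    #include <fcntl.h>
    #include <mqueue.h>
    #include <stdio.h>

    int main(void)
    {
        /* Queues are kernel-persistent, independent entities: if this
         * process dies before mq_unlink(), "/mbus-endpoint" lives on. */
        struct mq_attr attr = { .mq_maxmsg = 10, .mq_msgsize = 256 };
        mqd_t q = mq_open("/mbus-endpoint", O_CREAT | O_RDONLY, 0600, &attr);

        if (q == (mqd_t)-1) {
            perror("mq_open");
            return 1;
        }

        /* ... mq_receive() loop would go here ... */

        mq_close(q);                 /* closes the descriptor only */
        mq_unlink("/mbus-endpoint"); /* never happens if we crash above */
        return 0;
    }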
>>> > Thanks,
>>> > Alex
>>> >
>>> > On 21 May 2015 at 14:12, Ola Liljedahl <ola.liljed...@linaro.org>
>>> > wrote:
>>> >>
>>> >> On 21 May 2015 at 11:50, Savolainen, Petri (Nokia - FI/Espoo)
>>> >> <petri.savolai...@nokia.com> wrote:
>>> >> >
>>> >> >> -----Original Message-----
>>> >> >> From: lng-odp [mailto:lng-odp-boun...@lists.linaro.org] On
>>> >> >> Behalf Of ext Ola Liljedahl
>>> >> >> Sent: Tuesday, May 19, 2015 1:04 AM
>>> >> >> To: lng-odp@lists.linaro.org
>>> >> >> Subject: [lng-odp] [RFC] Add ipc.h
>>> >> >>
>>> >> >> As promised, here is my first attempt at a standalone API for
>>> >> >> IPC - inter-process communication in a shared-nothing
>>> >> >> architecture (message passing between processes which do not
>>> >> >> share memory).
>>> >> >>
>>> >> >> Currently all definitions are in the file ipc.h but it is
>>> >> >> possible to break out some message/event related definitions
>>> >> >> (everything from odp_ipc_sender) into a separate file
>>> >> >> message.h. This would mimic the packet_io.h/packet.h
>>> >> >> separation.
>>> >> >>
>>> >> >> The semantics of message passing are that sending a message to
>>> >> >> an endpoint will always look like it succeeds. The appearance
>>> >> >> of endpoints is explicitly notified through user-defined
>>> >> >> messages specified in the odp_ipc_resolve() call. Similarly,
>>> >> >> the disappearance (e.g. death or otherwise lost connection) is
>>> >> >> also explicitly notified through user-defined messages
>>> >> >> specified in the odp_ipc_monitor() call. The send call does
>>> >> >> not fail because the addressed endpoint has disappeared.
>>> >> >>
>>> >> >> Messages (from endpoint A to endpoint B) are delivered in
>>> >> >> order. If message N sent to an endpoint is delivered, then all
>>> >> >> messages <N have also been delivered. Message delivery does
>>> >> >> not guarantee actual processing by the
>>> >> >
>>> >> > Ordered is an OK requirement, but "all messages <N have also
>>> >> > been delivered" means in practice lossless delivery (== retries
>>> >> > and retransmission windows, etc). Lossy vs lossless link should
>>> >> > be a configuration option.
>>> >> I am just targeting internal communication, which I expect to be
>>> >> reliable. There is no physical "link" involved. If an
>>> >> implementation chooses to use some unreliable medium, then it
>>> >> will need to take some countermeasures. Any loss of messages
>>> >> could be detected using sequence numbers (and timeouts) and
>>> >> handled by (temporary) disconnection (so that no more messages
>>> >> will be delivered should one go missing).
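(A sketch of that "deliver in sequence or disconnect" policy, with all
type and function names hypothetical: the receiver tracks the next
expected sequence number per peer and stops delivering on the first gap,
so a single lost message becomes an explicit disconnection rather than a
silent hole in the stream.)

    #include <stdbool.h>
    #include <stdint.h>

    /* Hypothetical per-peer receive state. */
    struct peer {
        uint32_t next_seq;  /* next expected sequence number */
        bool connected;
    };

    /* Returns true if the message may be delivered. A gap means a
     * message was lost, so the peer is disconnected and no further
     * messages are delivered until an explicit (re)connection. */
    static bool accept_msg(struct peer *p, uint32_t msg_seq)
    {
        if (!p->connected)
            return false;
        if (msg_seq != p->next_seq) {   /* gap => loss detected */
            p->connected = false;
            return false;
        }
        p->next_seq++;
        return true;
    }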
>>> >>
>>> >> I am OK with adding the lossless/lossy configuration to the API
>>> >> as long as the lossless option is always implemented. Is this a
>>> >> configuration when creating the local IPC endpoint or when
>>> >> sending a message to another endpoint?
>>> >>
>>> >> > Also, what does "delivered" mean?
>>> >> >
>>> >> > Message:
>>> >> > - transmitted successfully over the link?
>>> >> > - is now under control of the remote node (post office)?
>>> >> > - delivered into the application input queue?
>>> >> Probably this one, but I am not sure the exact definition
>>> >> matters: "has been delivered" or "will eventually be delivered
>>> >> unless the connection to the destination is lost". Maybe there is
>>> >> a better word than "delivered"?
>>> >>
>>> >> "Made available into the destination (recipient) address space"?
>>> >>
>>> >> > - has been dequeued from the application queue?
>>> >> >
>>> >> >> recipient. End-to-end acknowledgements (using messages) should
>>> >> >> be used if this guarantee is important to the user.
>>> >> >>
>>> >> >> IPC endpoints can be seen as interfaces (taps) to an internal
>>> >> >> reliable multidrop network where each endpoint has a unique
>>> >> >> address which is only valid for the lifetime of the endpoint.
>>> >> >> I.e. if an endpoint is destroyed and then recreated (with the
>>> >> >> same name), the new endpoint will have a new address
>>> >> >> (eventually endpoint addresses will have to be recycled, but
>>> >> >> not for a very long time). Endpoint names do not necessarily
>>> >> >> have to be unique.
>>> >> >
>>> >> > How widely are these addresses unique: inside one VM, multiple
>>> >> > VMs under the same host, multiple devices on a LAN (VLAN), ...?
>>> >> Currently, the scope of the name and address space is defined by
>>> >> the implementation. Perhaps we should define it? My current
>>> >> interest is within an OS instance (bare metal or virtualised).
>>> >> Between different OS instances, I expect something based on IP to
>>> >> be used (because you don't know where those different OS/VM
>>> >> instances will be deployed, so you need topology-independent
>>> >> addressing).
>>> >>
>>> >> Based on other feedback, I have dropped the contended usage of
>>> >> "IPC" and now call it "message bus" (MBUS).
>>> >>
>>> >> "MBUS endpoints can be seen as interfaces (taps) to an
>>> >> OS-internal reliable multidrop network"...
>>> >>
>>> >> >>
>>> >> >> Signed-off-by: Ola Liljedahl <ola.liljed...@linaro.org>
>>> >> >> ---
>>> >> >> (This document/code contribution attached is provided under
>>> >> >> the terms of agreement LES-LTM-21309)
>>> >> >>
>>> >> >> +/**
>>> >> >> + * Create IPC endpoint
>>> >> >> + *
>>> >> >> + * @param name Name of local IPC endpoint
>>> >> >> + * @param pool Pool for incoming messages
>>> >> >> + *
>>> >> >> + * @return IPC handle on success
>>> >> >> + * @retval ODP_IPC_INVALID on failure and errno set
>>> >> >> + */
>>> >> >> +odp_ipc_t odp_ipc_create(const char *name, odp_pool_t pool);
>>> >> >
>>> >> > This creates (implicitly) the local endpoint address.
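(For illustration, endpoint creation might look as follows. This is a
sketch only: the pool setup uses the general ODP pool API as I recall it
from that time and the parameter values are invented; only
odp_ipc_create() and ODP_IPC_INVALID come from the draft above.)

    #include <odp.h>

    static odp_ipc_t make_endpoint(void)
    {
        /* Error checks trimmed for brevity. */
        odp_pool_param_t params;
        odp_pool_param_init(&params);
        params.type = ODP_POOL_BUFFER;  /* assumed backing store for messages */
        params.buf.num = 1024;          /* invented sizing */
        params.buf.size = 2048;

        odp_pool_t pool = odp_pool_create("mbus_pool", &params);
        odp_ipc_t ipc = odp_ipc_create("my_endpoint", pool);
        if (ipc == ODP_IPC_INVALID) {
            /* errno holds the failure reason, per the contract above */
        }
        return ipc;
    }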
>>> >> >
>>> >> >> +
>>> >> >> +/**
>>> >> >> + * Set the default input queue for an IPC endpoint
>>> >> >> + *
>>> >> >> + * @param ipc   IPC handle
>>> >> >> + * @param queue Queue handle
>>> >> >> + *
>>> >> >> + * @retval 0 on success
>>> >> >> + * @retval <0 on failure
>>> >> >> + */
>>> >> >> +int odp_ipc_inq_setdef(odp_ipc_t ipc, odp_queue_t queue);
>>> >> >
>>> >> > Multiple input queues are likely needed for different priority
>>> >> > messages.
>>> >> >
>>> >> >> +
>>> >> >> +/**
>>> >> >> + * Resolve endpoint by name
>>> >> >> + *
>>> >> >> + * Look up an existing or future endpoint by name.
>>> >> >> + * When the endpoint exists, return the specified message
>>> >> >> + * with the endpoint as the sender.
>>> >> >> + *
>>> >> >> + * @param ipc  IPC handle
>>> >> >> + * @param name Name to resolve
>>> >> >> + * @param msg  Message to return
>>> >> >> + */
>>> >> >> +void odp_ipc_resolve(odp_ipc_t ipc,
>>> >> >> +                     const char *name,
>>> >> >> +                     odp_ipc_msg_t msg);
>>> >> >
>>> >> > How widely are these names visible? Inside one VM, multiple VMs
>>> >> > under the same host, multiple devices on a LAN (VLAN), ...?
>>> >> >
>>> >> > I think name service (or address resolution) is better handled
>>> >> > in a middleware layer. If ODP provides unique addresses and a
>>> >> > message passing mechanism, additional services can be built on
>>> >> > top.
>>> >> >
>>> >> >> +
>>> >> >> +/**
>>> >> >> + * Monitor endpoint
>>> >> >> + *
>>> >> >> + * Monitor an existing (potentially already dead) endpoint.
>>> >> >> + * When the endpoint is dead, return the specified message
>>> >> >> + * with the endpoint as the sender.
>>> >> >> + *
>>> >> >> + * Unrecognized or invalid endpoint addresses are treated as
>>> >> >> + * dead endpoints.
>>> >> >> + *
>>> >> >> + * @param ipc  IPC handle
>>> >> >> + * @param addr Address of monitored endpoint
>>> >> >> + * @param msg  Message to return
>>> >> >> + */
>>> >> >> +void odp_ipc_monitor(odp_ipc_t ipc,
>>> >> >> +                     const uint8_t addr[ODP_IPC_ADDR_SIZE],
>>> >> >> +                     odp_ipc_msg_t msg);
>>> >> >
>>> >> > Again, I'd see node health monitoring and alarms as middleware
>>> >> > services.
>>> >> >
>>> >> >> +
>>> >> >> +/**
>>> >> >> + * Send message
>>> >> >> + *
>>> >> >> + * Send a message to an endpoint (which may already be dead).
>>> >> >> + * Message delivery is ordered and reliable. All (accepted)
>>> >> >> + * messages will be delivered up to the point of endpoint
>>> >> >> + * death or lost connection.
>>> >> >> + * Actual reception and processing is not guaranteed (use
>>> >> >> + * end-to-end acknowledgements for that).
>>> >> >> + * Monitor the remote endpoint to detect death or lost
>>> >> >> + * connection.
>>> >> >> + *
>>> >> >> + * @param ipc  IPC handle
>>> >> >> + * @param msg  Message to send
>>> >> >> + * @param addr Address of remote endpoint
>>> >> >> + *
>>> >> >> + * @retval 0 on success
>>> >> >> + * @retval <0 on error
>>> >> >> + */
>>> >> >> +int odp_ipc_send(odp_ipc_t ipc,
>>> >> >> +                 odp_ipc_msg_t msg,
>>> >> >> +                 const uint8_t addr[ODP_IPC_ADDR_SIZE]);
>>> >> >
>>> >> > This would be used to send a message to an address, but normal
>>> >> > odp_queue_enq() could be used to circulate this event inside an
>>> >> > application (ODP instance).
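(Putting odp_ipc_resolve() and odp_ipc_send() together, a client flow
might look like the sketch below. msg_alloc() stands in for whatever
message allocation the final API provides, and odp_ipc_sender() is the
accessor quoted further down; event-dispatch plumbing is omitted.)

    static void greet_peer(odp_ipc_t ipc)
    {
        uint8_t peer[ODP_IPC_ADDR_SIZE];

        odp_ipc_msg_t note = msg_alloc();        /* hypothetical allocator */
        odp_ipc_resolve(ipc, "peer_endpoint", note); /* async name lookup */

        /* ... when 'note' later arrives on our input queue, the
         * resolved endpoint is its sender: */
        odp_ipc_sender(note, peer);

        odp_ipc_msg_t hello = msg_alloc();
        /* ... fill odp_ipc_data(hello) and set its length ... */
        if (odp_ipc_send(ipc, hello, peer) < 0) {
            /* message was not accepted */
        }
    }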
>>> >> >
>>> >> >> +
>>> >> >> +/**
>>> >> >> + * Get address of sender (source) of message
>>> >> >> + *
>>> >> >> + * @param msg  Message handle
>>> >> >> + * @param addr Address of sender endpoint
>>> >> >> + */
>>> >> >> +void odp_ipc_sender(odp_ipc_msg_t msg,
>>> >> >> +                    uint8_t addr[ODP_IPC_ADDR_SIZE]);
>>> >> >> +
>>> >> >> +/**
>>> >> >> + * Message data pointer
>>> >> >> + *
>>> >> >> + * Return a pointer to the message data
>>> >> >> + *
>>> >> >> + * @param msg Message handle
>>> >> >> + *
>>> >> >> + * @return Pointer to the message data
>>> >> >> + */
>>> >> >> +void *odp_ipc_data(odp_ipc_msg_t msg);
>>> >> >> +
>>> >> >> +/**
>>> >> >> + * Message data length
>>> >> >> + *
>>> >> >> + * Return the length of the message data.
>>> >> >> + *
>>> >> >> + * @param msg Message handle
>>> >> >> + *
>>> >> >> + * @return Message length
>>> >> >> + */
>>> >> >> +uint32_t odp_ipc_length(const odp_ipc_msg_t msg);
>>> >> >> +
>>> >> >> +/**
>>> >> >> + * Set message length
>>> >> >> + *
>>> >> >> + * Set the length of the message data.
>>> >> >> + *
>>> >> >> + * @param msg Message handle
>>> >> >> + * @param len New length
>>> >> >> + *
>>> >> >> + * @retval 0 on success
>>> >> >> + * @retval <0 on error
>>> >> >> + */
>>> >> >> +int odp_ipc_reset(const odp_ipc_msg_t msg, uint32_t len);
>>> >> >
>>> >> > When the data pointer or data length is modified: push/pull
>>> >> > head and push/pull tail would be the analogies from the packet
>>> >> > API.
>>> >> >
>>> >> > -Petri
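(On the receive side, the accessors above are enough to build the
end-to-end acknowledgements that odp_ipc_send() deliberately leaves to
the application. A hypothetical handler; msg_alloc() and process() are
stand-ins, not part of the draft API.)

    static void handle_msg(odp_ipc_t ipc, odp_ipc_msg_t msg)
    {
        void *data = odp_ipc_data(msg);
        uint32_t len = odp_ipc_length(msg);
        uint8_t src[ODP_IPC_ADDR_SIZE];

        odp_ipc_sender(msg, src);          /* who to acknowledge */
        process(data, len);                /* application logic (assumed) */

        odp_ipc_msg_t ack = msg_alloc();   /* hypothetical allocator */
        /* ... write an application-defined ACK into odp_ipc_data(ack) ... */
        odp_ipc_send(ipc, ack, src);       /* end-to-end acknowledgement */
    }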
_______________________________________________
lng-odp mailing list
lng-odp@lists.linaro.org
https://lists.linaro.org/mailman/listinfo/lng-odp