I got the impression that the ODP MBUS API would define a transport
protocol/API between an ODP application and a control plane application,
just as TCP is the transport protocol for HTTP applications (e.g. the
Web). Netlink defines exactly that: a transport protocol for
configuration messages. Maxim asked about the messages - should
applications define the message format and/or the message content?
Wouldn't it be an easier task for the application to define only the
content and let ODP define the format? Reliability could be an issue,
but the Netlink spec describes how applications can create reliable
protocols:

    One could create a reliable protocol between an FEC and a CPC by
    using the combination of sequence numbers, ACKs, and retransmit
    timers. Both sequence numbers and ACKs are provided by Netlink;
    timers are provided by Linux.

    One could create a heartbeat protocol between the FEC and CPC by
    using the ECHO flags and the NLMSG_NOOP message.
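(To make the RFC 3549 recipe concrete, here is a minimal sketch in C of
the seq/ACK plumbing, assuming a socket opened with
socket(AF_NETLINK, SOCK_RAW, NETLINK_ROUTE). Whether a given kernel
acknowledges an NLMSG_NOOP is not something the spec promises, so treat
this as an illustration of the mechanism, not a tested heartbeat.)

    #include <linux/netlink.h>
    #include <stdint.h>
    #include <string.h>
    #include <sys/socket.h>

    /* Send a no-op with NLM_F_ACK set and wait for the kernel's
     * NLMSG_ERROR reply carrying our sequence number; an error value
     * of 0 is a positive ACK. Wrapping this in a retransmit timer
     * yields the reliable protocol described above. */
    static int nl_acked_noop(int fd, uint32_t seq)
    {
        struct nlmsghdr req;

        memset(&req, 0, sizeof(req));
        req.nlmsg_len = NLMSG_LENGTH(0);     /* header only, no payload */
        req.nlmsg_type = NLMSG_NOOP;         /* heartbeat-style no-op */
        req.nlmsg_flags = NLM_F_REQUEST | NLM_F_ACK;
        req.nlmsg_seq = seq;                 /* sequence number we track */

        if (send(fd, &req, req.nlmsg_len, 0) < 0)
            return -1;

        char buf[4096];
        ssize_t n = recv(fd, buf, sizeof(buf), 0); /* use a timeout in real code */
        struct nlmsghdr *nh;

        for (nh = (struct nlmsghdr *)buf; n > 0 && NLMSG_OK(nh, n);
             nh = NLMSG_NEXT(nh, n)) {
            if (nh->nlmsg_type == NLMSG_ERROR && nh->nlmsg_seq == seq)
                return ((struct nlmsgerr *)NLMSG_DATA(nh))->error; /* 0 == ACK */
        }
        return -1;                           /* no ACK seen: retransmit */
    }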
On 21 May 2015 at 16:23, Ola Liljedahl <ola.liljed...@linaro.org> wrote:
> On 21 May 2015 at 15:05, Alexandru Badicioiu <
> alexandru.badici...@linaro.org> wrote:
>
>> I was referring to the Netlink protocol in itself, as a model for ODP
>> MBUS (or IPC).
>>
> Isn't the Netlink protocol what the endpoints send between them? This
> is not specified by the ODP IPC/MBUS API; applications can define or
> re-use whatever protocol they like. The protocol definition is heavily
> dependent on what you actually use the IPC for, and we shouldn't force
> ODP users to use some specific predefined protocol.
>
> Also, the "wire protocol" is left undefined; this is up to the
> implementation to define, and each platform can have its own
> definition.
>
> And Netlink isn't even reliable. I know that this creates problems,
> e.g. it is impossible to get a clean and complete snapshot of e.g. the
> routing table.
>
>> The interaction between the FEC and the CPC, in the Netlink context,
>> defines a protocol. Netlink provides mechanisms for the CPC
>> (residing in user space) and the FEC (residing in kernel space) to
>> have their own protocol definition -- *kernel space and user space
>> just mean different protection domains*. Therefore, a wire protocol
>> is needed to communicate. The wire protocol is normally provided by
>> some privileged service that is able to copy between multiple
>> protection domains. We will refer to this service as the Netlink
>> service. The Netlink service can also be encapsulated in a different
>> transport layer, if the CPC executes on a different node than the
>> FEC. The FEC and CPC, using Netlink mechanisms, may choose to define
>> a reliable protocol between each other. By default, however, Netlink
>> provides an unreliable communication.
>>
>> Note that the FEC and CPC can both live in the same memory protection
>> domain and use the connect() system call to create a path to the peer
>> and talk to each other. We will not discuss this mechanism further
>> other than to say that it is available. Throughout this document, we
>> will refer interchangeably to the FEC to mean kernel space and the
>> CPC to mean user space. This denomination is not meant, however, to
>> restrict the two components to these protection domains or to the
>> same compute node.
>>
>> On 21 May 2015 at 15:55, Ola Liljedahl <ola.liljed...@linaro.org> wrote:
>>
>>> On 21 May 2015 at 13:22, Alexandru Badicioiu <
>>> alexandru.badici...@linaro.org> wrote:
>>> > Hi,
>>> > would the Netlink protocol (https://tools.ietf.org/html/rfc3549)
>>> > fit the purpose of ODP IPC (within a single OS instance)?
>>> I interpret this as a question of whether Netlink would be suitable
>>> as an implementation of the ODP IPC (now called message bus because
>>> "IPC" is so contended and imbued with different meanings).
>>>
>>> It is perhaps possible. Netlink seems a bit focused on intra-kernel
>>> and kernel-to-user, while the ODP IPC/MBUS is focused on user-to-user
>>> (application-to-application).
>>>
>>> I see a couple of primary requirements:
>>>
>>> - Support communication (message exchange) between user space
>>>   processes.
>>> - Support arbitrary user-defined messages.
>>> - Ordered, reliable delivery of messages.
>>>
>>> From the little I can quickly read up on Netlink, the first two
>>> requirements do not seem supported. But perhaps someone with more
>>> intimate knowledge of Netlink can prove me wrong. Or maybe Netlink
>>> can be extended to support u2u and user-defined messages, but the
>>> current specialization (e.g. specialized addressing, specialized
>>> message formats) seems contrary to the goal of providing generic
>>> mechanisms in the kernel that can be used for different things.
>>>
>>> My IPC/MBUS reference implementation for linux-generic builds upon
>>> POSIX message queues. One of my issues is that I want the message
>>> queue associated with a process to go away when the process goes
>>> away; the message queues should not be independent entities.
>>>
>>> -- Ola
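(The lifetime problem is visible directly in the POSIX API: message
queues are kernel-persistent and survive the creating process unless
explicitly unlinked. A minimal sketch - queue name and attributes are
invented for illustration; link with -lrt:)

    #include <fcntl.h>
    #include <mqueue.h>
    #include <stdio.h>

    int main(void)
    {
        /* Queues are kernel-persistent, independent entities: if this
         * process dies before mq_unlink(), "/mbus-endpoint" lives on. */
        struct mq_attr attr = { .mq_maxmsg = 10, .mq_msgsize = 256 };
        mqd_t q = mq_open("/mbus-endpoint", O_CREAT | O_RDONLY, 0600, &attr);

        if (q == (mqd_t)-1) {
            perror("mq_open");
            return 1;
        }

        /* ... mq_receive() loop would go here ... */

        mq_close(q);                 /* closes the descriptor only */
        mq_unlink("/mbus-endpoint"); /* never happens if we crash above */
        return 0;
    }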
>>> > Thanks,
>>> > Alex
>>> >
>>> > On 21 May 2015 at 14:12, Ola Liljedahl <ola.liljed...@linaro.org>
>>> > wrote:
>>> >>
>>> >> On 21 May 2015 at 11:50, Savolainen, Petri (Nokia - FI/Espoo)
>>> >> <petri.savolai...@nokia.com> wrote:
>>> >> >
>>> >> >> -----Original Message-----
>>> >> >> From: lng-odp [mailto:lng-odp-boun...@lists.linaro.org] On
>>> >> >> Behalf Of ext Ola Liljedahl
>>> >> >> Sent: Tuesday, May 19, 2015 1:04 AM
>>> >> >> To: lng-odp@lists.linaro.org
>>> >> >> Subject: [lng-odp] [RFC] Add ipc.h
>>> >> >>
>>> >> >> As promised, here is my first attempt at a standalone API for
>>> >> >> IPC - inter-process communication in a shared-nothing
>>> >> >> architecture (message passing between processes which do not
>>> >> >> share memory).
>>> >> >>
>>> >> >> Currently all definitions are in the file ipc.h but it is
>>> >> >> possible to break out some message/event related definitions
>>> >> >> (everything from odp_ipc_sender) into a separate file
>>> >> >> message.h. This would mimic the packet_io.h/packet.h
>>> >> >> separation.
>>> >> >>
>>> >> >> The semantics of message passing are that sending a message to
>>> >> >> an endpoint will always look like it succeeds. The appearance
>>> >> >> of endpoints is explicitly notified through user-defined
>>> >> >> messages specified in the odp_ipc_resolve() call. Similarly,
>>> >> >> the disappearance (e.g. death or otherwise lost connection) is
>>> >> >> also explicitly notified through user-defined messages
>>> >> >> specified in the odp_ipc_monitor() call. The send call does
>>> >> >> not fail because the addressed endpoint has disappeared.
>>> >> >>
>>> >> >> Messages (from endpoint A to endpoint B) are delivered in
>>> >> >> order. If message N sent to an endpoint is delivered, then all
>>> >> >> messages <N have also been delivered. Message delivery does
>>> >> >> not guarantee actual processing by the
>>> >> >
>>> >> > Ordered is an OK requirement, but "all messages <N have also
>>> >> > been delivered" means in practice lossless delivery (== retries
>>> >> > and retransmission windows, etc). Lossy vs lossless link should
>>> >> > be a configuration option.
>>> >> I am just targeting internal communication, which I expect to be
>>> >> reliable. There is no physical "link" involved. If an
>>> >> implementation chooses to use some unreliable medium, then it
>>> >> will need to take some countermeasures. Any loss of messages
>>> >> could be detected using sequence numbers (and timeouts) and
>>> >> handled by (temporary) disconnection (so that no more messages
>>> >> will be delivered should one go missing).
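(A sketch of that "deliver in sequence or disconnect" policy, with all
type and function names hypothetical: the receiver tracks the next
expected sequence number per peer and stops delivering on the first gap,
so a single lost message becomes an explicit disconnection rather than a
silent hole in the stream.)

    #include <stdbool.h>
    #include <stdint.h>

    /* Hypothetical per-peer receive state. */
    struct peer {
        uint32_t next_seq;  /* next expected sequence number */
        bool connected;
    };

    /* Returns true if the message may be delivered. A gap means a
     * message was lost, so the peer is disconnected and no further
     * messages are delivered until an explicit (re)connection. */
    static bool accept_msg(struct peer *p, uint32_t msg_seq)
    {
        if (!p->connected)
            return false;
        if (msg_seq != p->next_seq) {   /* gap => loss detected */
            p->connected = false;
            return false;
        }
        p->next_seq++;
        return true;
    }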
>>> >>
>>> >> I am OK with adding the lossless/lossy configuration to the API
>>> >> as long as the lossless option is always implemented. Is this a
>>> >> configuration when creating the local IPC endpoint or when
>>> >> sending a message to another endpoint?
>>> >>
>>> >> > Also, what does "delivered" mean?
>>> >> >
>>> >> > Message:
>>> >> > - transmitted successfully over the link?
>>> >> > - is now under control of the remote node (post office)?
>>> >> > - delivered into the application input queue?
>>> >> Probably this one, but I am not sure the exact definition
>>> >> matters: "has been delivered" or "will eventually be delivered
>>> >> unless the connection to the destination is lost". Maybe there is
>>> >> a better word than "delivered"?
>>> >>
>>> >> "Made available into the destination (recipient) address space"?
>>> >>
>>> >> > - has been dequeued from the application queue?
>>> >> >
>>> >> >> recipient. End-to-end acknowledgements (using messages) should
>>> >> >> be used if this guarantee is important to the user.
>>> >> >>
>>> >> >> IPC endpoints can be seen as interfaces (taps) to an internal
>>> >> >> reliable multidrop network where each endpoint has a unique
>>> >> >> address which is only valid for the lifetime of the endpoint.
>>> >> >> I.e. if an endpoint is destroyed and then recreated (with the
>>> >> >> same name), the new endpoint will have a new address
>>> >> >> (eventually endpoint addresses will have to be recycled, but
>>> >> >> not for a very long time). Endpoint names do not necessarily
>>> >> >> have to be unique.
>>> >> >
>>> >> > How widely are these addresses unique: inside one VM, multiple
>>> >> > VMs under the same host, multiple devices on a LAN (VLAN), ...?
>>> >> Currently, the scope of the name and address space is defined by
>>> >> the implementation. Perhaps we should define it? My current
>>> >> interest is within an OS instance (bare metal or virtualised).
>>> >> Between different OS instances, I expect something based on IP to
>>> >> be used (because you don't know where those different OS/VM
>>> >> instances will be deployed, so you need topology-independent
>>> >> addressing).
>>> >>
>>> >> Based on other feedback, I have dropped the contended usage of
>>> >> "IPC" and now call it "message bus" (MBUS).
>>> >>
>>> >> "MBUS endpoints can be seen as interfaces (taps) to an
>>> >> OS-internal reliable multidrop network"...
>>> >>
>>> >> >>
>>> >> >> Signed-off-by: Ola Liljedahl <ola.liljed...@linaro.org>
>>> >> >> ---
>>> >> >> (This document/code contribution attached is provided under
>>> >> >> the terms of agreement LES-LTM-21309)
>>> >> >>
>>> >> >> +/**
>>> >> >> + * Create IPC endpoint
>>> >> >> + *
>>> >> >> + * @param name Name of local IPC endpoint
>>> >> >> + * @param pool Pool for incoming messages
>>> >> >> + *
>>> >> >> + * @return IPC handle on success
>>> >> >> + * @retval ODP_IPC_INVALID on failure and errno set
>>> >> >> + */
>>> >> >> +odp_ipc_t odp_ipc_create(const char *name, odp_pool_t pool);
>>> >> >
>>> >> > This creates (implicitly) the local endpoint address.
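(For illustration, endpoint creation might look as follows. This is a
sketch only: the pool setup uses the general ODP pool API as I recall it
from that time and the parameter values are invented; only
odp_ipc_create() and ODP_IPC_INVALID come from the draft above.)

    #include <odp.h>

    static odp_ipc_t make_endpoint(void)
    {
        /* Error checks trimmed for brevity. */
        odp_pool_param_t params;
        odp_pool_param_init(&params);
        params.type = ODP_POOL_BUFFER;  /* assumed backing store for messages */
        params.buf.num = 1024;          /* invented sizing */
        params.buf.size = 2048;

        odp_pool_t pool = odp_pool_create("mbus_pool", &params);
        odp_ipc_t ipc = odp_ipc_create("my_endpoint", pool);
        if (ipc == ODP_IPC_INVALID) {
            /* errno holds the failure reason, per the contract above */
        }
        return ipc;
    }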
>>> >> >
>>> >> >> +
>>> >> >> +/**
>>> >> >> + * Set the default input queue for an IPC endpoint
>>> >> >> + *
>>> >> >> + * @param ipc   IPC handle
>>> >> >> + * @param queue Queue handle
>>> >> >> + *
>>> >> >> + * @retval 0 on success
>>> >> >> + * @retval <0 on failure
>>> >> >> + */
>>> >> >> +int odp_ipc_inq_setdef(odp_ipc_t ipc, odp_queue_t queue);
>>> >> >
>>> >> > Multiple input queues are likely needed for different priority
>>> >> > messages.
>>> >> >
>>> >> >> +
>>> >> >> +/**
>>> >> >> + * Resolve endpoint by name
>>> >> >> + *
>>> >> >> + * Look up an existing or future endpoint by name.
>>> >> >> + * When the endpoint exists, return the specified message
>>> >> >> + * with the endpoint as the sender.
>>> >> >> + *
>>> >> >> + * @param ipc  IPC handle
>>> >> >> + * @param name Name to resolve
>>> >> >> + * @param msg  Message to return
>>> >> >> + */
>>> >> >> +void odp_ipc_resolve(odp_ipc_t ipc,
>>> >> >> +                     const char *name,
>>> >> >> +                     odp_ipc_msg_t msg);
>>> >> >
>>> >> > How widely are these names visible? Inside one VM, multiple VMs
>>> >> > under the same host, multiple devices on a LAN (VLAN), ...?
>>> >> >
>>> >> > I think name service (or address resolution) is better handled
>>> >> > in a middleware layer. If ODP provides unique addresses and a
>>> >> > message passing mechanism, additional services can be built on
>>> >> > top.
>>> >> >
>>> >> >> +
>>> >> >> +/**
>>> >> >> + * Monitor endpoint
>>> >> >> + *
>>> >> >> + * Monitor an existing (potentially already dead) endpoint.
>>> >> >> + * When the endpoint is dead, return the specified message
>>> >> >> + * with the endpoint as the sender.
>>> >> >> + *
>>> >> >> + * Unrecognized or invalid endpoint addresses are treated as
>>> >> >> + * dead endpoints.
>>> >> >> + *
>>> >> >> + * @param ipc  IPC handle
>>> >> >> + * @param addr Address of monitored endpoint
>>> >> >> + * @param msg  Message to return
>>> >> >> + */
>>> >> >> +void odp_ipc_monitor(odp_ipc_t ipc,
>>> >> >> +                     const uint8_t addr[ODP_IPC_ADDR_SIZE],
>>> >> >> +                     odp_ipc_msg_t msg);
>>> >> >
>>> >> > Again, I'd see node health monitoring and alarms as middleware
>>> >> > services.
>>> >> >
>>> >> >> +
>>> >> >> +/**
>>> >> >> + * Send message
>>> >> >> + *
>>> >> >> + * Send a message to an endpoint (which may already be dead).
>>> >> >> + * Message delivery is ordered and reliable. All (accepted)
>>> >> >> + * messages will be delivered up to the point of endpoint
>>> >> >> + * death or lost connection.
>>> >> >> + * Actual reception and processing is not guaranteed (use
>>> >> >> + * end-to-end acknowledgements for that).
>>> >> >> + * Monitor the remote endpoint to detect death or lost
>>> >> >> + * connection.
>>> >> >> + *
>>> >> >> + * @param ipc  IPC handle
>>> >> >> + * @param msg  Message to send
>>> >> >> + * @param addr Address of remote endpoint
>>> >> >> + *
>>> >> >> + * @retval 0 on success
>>> >> >> + * @retval <0 on error
>>> >> >> + */
>>> >> >> +int odp_ipc_send(odp_ipc_t ipc,
>>> >> >> +                 odp_ipc_msg_t msg,
>>> >> >> +                 const uint8_t addr[ODP_IPC_ADDR_SIZE]);
>>> >> >
>>> >> > This would be used to send a message to an address, but normal
>>> >> > odp_queue_enq() could be used to circulate this event inside an
>>> >> > application (ODP instance).
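(Putting odp_ipc_resolve() and odp_ipc_send() together, a client flow
might look like the sketch below. msg_alloc() stands in for whatever
message allocation the final API provides, and odp_ipc_sender() is the
accessor quoted further down; event-dispatch plumbing is omitted.)

    static void greet_peer(odp_ipc_t ipc)
    {
        uint8_t peer[ODP_IPC_ADDR_SIZE];

        odp_ipc_msg_t note = msg_alloc();        /* hypothetical allocator */
        odp_ipc_resolve(ipc, "peer_endpoint", note); /* async name lookup */

        /* ... when 'note' later arrives on our input queue, the
         * resolved endpoint is its sender: */
        odp_ipc_sender(note, peer);

        odp_ipc_msg_t hello = msg_alloc();
        /* ... fill odp_ipc_data(hello) and set its length ... */
        if (odp_ipc_send(ipc, hello, peer) < 0) {
            /* message was not accepted */
        }
    }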
>>> >> >
>>> >> >> +
>>> >> >> +/**
>>> >> >> + * Get address of sender (source) of message
>>> >> >> + *
>>> >> >> + * @param msg  Message handle
>>> >> >> + * @param addr Address of sender endpoint
>>> >> >> + */
>>> >> >> +void odp_ipc_sender(odp_ipc_msg_t msg,
>>> >> >> +                    uint8_t addr[ODP_IPC_ADDR_SIZE]);
>>> >> >> +
>>> >> >> +/**
>>> >> >> + * Message data pointer
>>> >> >> + *
>>> >> >> + * Return a pointer to the message data
>>> >> >> + *
>>> >> >> + * @param msg Message handle
>>> >> >> + *
>>> >> >> + * @return Pointer to the message data
>>> >> >> + */
>>> >> >> +void *odp_ipc_data(odp_ipc_msg_t msg);
>>> >> >> +
>>> >> >> +/**
>>> >> >> + * Message data length
>>> >> >> + *
>>> >> >> + * Return the length of the message data.
>>> >> >> + *
>>> >> >> + * @param msg Message handle
>>> >> >> + *
>>> >> >> + * @return Message length
>>> >> >> + */
>>> >> >> +uint32_t odp_ipc_length(const odp_ipc_msg_t msg);
>>> >> >> +
>>> >> >> +/**
>>> >> >> + * Set message length
>>> >> >> + *
>>> >> >> + * Set the length of the message data.
>>> >> >> + *
>>> >> >> + * @param msg Message handle
>>> >> >> + * @param len New length
>>> >> >> + *
>>> >> >> + * @retval 0 on success
>>> >> >> + * @retval <0 on error
>>> >> >> + */
>>> >> >> +int odp_ipc_reset(const odp_ipc_msg_t msg, uint32_t len);
>>> >> >
>>> >> > When the data pointer or data length is modified: push/pull
>>> >> > head and push/pull tail would be the analogies from the packet
>>> >> > API.
>>> >> >
>>> >> > -Petri
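(On the receive side, the accessors above are enough to build the
end-to-end acknowledgements that odp_ipc_send() deliberately leaves to
the application. A hypothetical handler; msg_alloc() and process() are
stand-ins, not part of the draft API.)

    static void handle_msg(odp_ipc_t ipc, odp_ipc_msg_t msg)
    {
        void *data = odp_ipc_data(msg);
        uint32_t len = odp_ipc_length(msg);
        uint8_t src[ODP_IPC_ADDR_SIZE];

        odp_ipc_sender(msg, src);          /* who to acknowledge */
        process(data, len);                /* application logic (assumed) */

        odp_ipc_msg_t ack = msg_alloc();   /* hypothetical allocator */
        /* ... write an application-defined ACK into odp_ipc_data(ack) ... */
        odp_ipc_send(ipc, ack, src);       /* end-to-end acknowledgement */
    }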
_______________________________________________
lng-odp mailing list
lng-odp@lists.linaro.org
https://lists.linaro.org/mailman/listinfo/lng-odp