This RFC is a follow-up to
http://xenomai.org/pipermail/xenomai/2016-May/036253.html.

To sum up the issue, calling Cobalt's ioctl() implementation on a file
descriptor pointing at an RTDM named device, or at a regular character
device, may create overhead for SCHED_WEAK threads when the ioctl
request should be processed by the ioctl_nrt handler. This is due to
extra mode switches, illustrated as follows (Cobalt syscalls are
enclosed by __RT(), glibc services by __STD()):

[Secondary mode]           <== mode switch ==>       [Primary mode]

app:__RT(ioctl(fd, ...))
         |
         +-----------------------------------------> driver:ioctl_rt
                                                           |
                                           returns -ENOSYS |
                                                           |
driver:ioctl_nrt <-----------------------------------------+

Since SCHED_WEAK threads normally run in the Linux domain, and Cobalt
always starts by probing the ioctl_rt() routine for handling the
request, a useless double mode switch happens in those particular
cases.
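
The double switch can be modeled with a small user-space sketch. This
is not Cobalt code: the sim_thread and sim_driver types, the
switch_mode() and sim_ioctl() helpers and the demo handlers are all
hypothetical names, used only to count the mode switches that the
"probe ioctl_rt first, fall back on -ENOSYS" policy costs a thread
starting in secondary mode.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical model, not Cobalt code: the two execution modes. */
enum exec_mode { MODE_SECONDARY, MODE_PRIMARY };

struct sim_thread {
	enum exec_mode mode;	/* current mode of the calling thread */
	int switches;		/* mode switches performed so far */
};

/* Hypothetical driver descriptor holding the two ioctl handlers. */
struct sim_driver {
	int (*ioctl_rt)(unsigned int request);
	int (*ioctl_nrt)(unsigned int request);
};

/* Change mode if needed, counting each actual transition. */
static void switch_mode(struct sim_thread *t, enum exec_mode target)
{
	if (t->mode != target) {
		t->mode = target;
		t->switches++;
	}
}

/* Probe ioctl_rt first, fall back to ioctl_nrt on -ENOSYS. */
static int sim_ioctl(struct sim_thread *t, const struct sim_driver *drv,
		     unsigned int request)
{
	int ret;

	assert(drv->ioctl_rt != NULL && drv->ioctl_nrt != NULL);

	switch_mode(t, MODE_PRIMARY);
	ret = drv->ioctl_rt(request);
	if (ret != -ENOSYS)
		return ret;

	switch_mode(t, MODE_SECONDARY);
	return drv->ioctl_nrt(request);
}

/* Demo handlers: the request can only be served from the nrt side. */
static int demo_ioctl_rt(unsigned int request)
{
	(void)request;
	return -ENOSYS;
}

static int demo_ioctl_nrt(unsigned int request)
{
	(void)request;
	return 0;
}
```

With these demo handlers, a thread starting in MODE_SECONDARY ends up
back in MODE_SECONDARY after sim_ioctl(), having paid for two
switches to reach a handler it could have run directly.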

The rationale behind probing the ioctl_rt() handler first is that
SCHED_WEAK threads are supposed to wait for events from, or share
non-critical resources with, real-time Cobalt threads and/or devices,
and that normally happens with the help of the Cobalt scheduler, which
requires the caller to run in primary mode. In short, this is the
runtime scenario Cobalt favors because this is the reason for running
SCHED_WEAK threads in the first place.

This leads to some clarification: SCHED_WEAK is not meant to run plain
regular POSIX threads that don't interface with the real-time
sub-system; for such a use case, one should call the regular
__STD(pthread_create()) service to create the thread, not libcobalt's
__RT(pthread_create()). Such a thread would be able to issue RTDM
ioctl() calls as well, ending up in the driver's ioctl_nrt() handler
directly.

Back to the issue, we have three options for fixing the overhead
described above for threads that actually need to run in the
SCHED_WEAK class:

1- Cobalt could figure out whether the incoming file descriptor is
  actually managed by RTDM, before switching to primary mode if so.
  This way, regular file descriptors would be rejected early, before
  any mode switch is attempted, and libcobalt could hand over the
  request to the (g)libc in such an event. The ioctl request that has
  to be processed from a plain Linux context could then be handled by
  a simple chardev driver.

2- We could allow the application to tell Cobalt that RTDM I/O calls
  issued on a given file descriptor are primarily directed to the
  non-rt handler in the driver. Typically, an open mode flag such as
  "O_WEAK" could do the job; such a flag would affect requests issued
  from SCHED_WEAK or regular threads unconditionally, or from
  real-time threads provided no rt handler is defined by the
  driver. In other cases, it would be ignored. This way, we would not
  allow the application to shoot itself in the foot by bypassing the
  RT handler inadvertently.

  e.g.:

    fd = open(some_rtdm_device, O_RDWR | O_WEAK);
    ret = ioctl(fd, SOMEDEV_RTIOC_FOO, &opt_arg);

  Pros: Applicable to all RTDM I/O requests (ioctl/read/write).

  Cons: Breaks the ABI with the introduction of the O_WEAK open flag.
  Complex semantics, given that SCHED_WEAK threads may behave
  differently than members of real-time scheduling classes for the
  same ioctl request on the same file descriptor.  Leaves the decision
  about the best mode to run a request implemented by a driver to the
  application, which seems odd.

3- We could introduce a special tag for composing the RTIOC code of an
  ioctl request, that a driver would use to state the preference for
  running the request in relaxed mode. The existing adaptive switch
  (ENOSYS) would still be available for handling requests for which no
  preference has been defined.

  e.g.:

  #define _IOC_RELAX                    15U
  #define _IOWRX(type, nr, size) \
          _IOWR(type, (nr) | (1U << _IOC_RELAX), size)

  #define SOMEDEV_RTIOC_FOO             _IOWRX(RTDM_CLASS_BAR, 10, &some_arg)

  Pros: The semantics are easy to grasp: the decision about the best
  mode to run any given request is left to the driver. It also implies
  the best practice of assigning an exclusive handler to each ioctl
  request, i.e. either _rt or _nrt, but not both.

  Cons: Breaks the ABI, by consuming a bit normally used to encode the
  ioctl number for the tag, restricting the namespace to 2^15 codes.
  Would not enable the mechanism for other RTDM I/O calls such as
  read/write(_nrt).
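
  As a rough illustration of option #3, the sketch below builds one
  tagged and one untagged request code with the stock Linux _IOWR()
  macro from <sys/ioctl.h>, then tests the tag the way a dispatcher
  might. DEMO_CLASS_BAR, struct demo_foo_args, the DEMO_RTIOC_* codes
  and request_prefers_relaxed() are hypothetical names made up for the
  example. Note that bit 15 overlaps the top bit of what the stock
  macros call the "type" field, hence the compile-time check on the
  class value.

```c
#include <assert.h>
#include <stdbool.h>
#include <sys/ioctl.h>

/* Tag bit proposed in the text: bit 15 of the request code. */
#define _IOC_RELAX		15U
#define _IOWRX(type, nr, size) \
	_IOWR(type, (nr) | (1U << _IOC_RELAX), size)

/* Hypothetical class value and argument type, for illustration only. */
#define DEMO_CLASS_BAR		0x42
struct demo_foo_args {
	int value;
};

/* The class must leave its top bit free, since the tag lands there. */
static_assert(DEMO_CLASS_BAR < 0x80, "class value collides with the tag");

/* One request tagged for relaxed mode, one left untagged. */
#define DEMO_RTIOC_FOO	_IOWRX(DEMO_CLASS_BAR, 10, struct demo_foo_args)
#define DEMO_RTIOC_BAZ	_IOWR(DEMO_CLASS_BAR, 11, struct demo_foo_args)

/* True if the driver tagged this request as preferring relaxed mode. */
static bool request_prefers_relaxed(unsigned int request)
{
	return (request & (1U << _IOC_RELAX)) != 0;
}
```

  Here request_prefers_relaxed(DEMO_RTIOC_FOO) is true and
  request_prefers_relaxed(DEMO_RTIOC_BAZ) is false; untagged requests
  would keep going through the existing -ENOSYS fallback.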

An implementation of option #1 is available from wip/handover in the
xenomai-3 repo. It is planned to merge these changes in 3.0.3, since
there is no reason for such overhead to exist with non-RTDM file
descriptors. Besides, it has no impact on the ABI; this is purely a
kernel-based optimization.
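
The idea behind option #1 can be sketched as follows. This is only a
model of the early check, not the code from wip/handover: the
rtdm_fd_table type, is_rtdm_fd() and route_ioctl() are hypothetical
names, and the linear lookup merely stands in for however Cobalt
actually tracks RTDM file descriptors.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Where the request ends up being handled. */
enum route { ROUTE_GLIBC, ROUTE_RTDM };

/* Hypothetical stand-in for the set of fds managed by RTDM. */
struct rtdm_fd_table {
	const int *fds;
	size_t count;
};

static bool is_rtdm_fd(const struct rtdm_fd_table *tbl, int fd)
{
	for (size_t i = 0; i < tbl->count; i++)
		if (tbl->fds[i] == fd)
			return true;
	return false;
}

/*
 * Decide the route before any mode switch: non-RTDM descriptors are
 * handed over to the regular libc without leaving secondary mode.
 */
static enum route route_ioctl(const struct rtdm_fd_table *tbl, int fd,
			      int *mode_switches)
{
	assert(tbl != NULL && mode_switches != NULL);

	if (!is_rtdm_fd(tbl, fd))
		return ROUTE_GLIBC;	/* rejected early, no switch */

	(*mode_switches)++;		/* go primary to probe ioctl_rt */
	return ROUTE_RTDM;
}
```

In this model, a regular file descriptor costs no mode switch at all,
while RTDM descriptors keep following the existing probing path.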

Options #2 and #3, or any other approach, should be discussed if there
is interest in them. Also, as a corollary to option #3, I would be
interested to hear about real-world use cases involving any of the
read_nrt, write_nrt, recvmsg_nrt and sendmsg_nrt handlers of RTDM.

-- 
Philippe.
