Hi, here is a new concern that seems to have been introduced by recent microkernel developments, namely Coyotos as well as the secure L4 variants. The problem did not exist in Mach, Minix, or L4.X2. I am not sure whether it existed in EROS (probably not).
The matter concerns the reliability of an RPC mechanism built on top of the IPC primitives, assuming a rather simple client-server model. In recent systems, reliability seems to have decreased, because a failure in the server can lead to indefinite resource allocation in the client if no additional provision is made.

Motivation: One argument for composing operating systems from multiple servers is to increase the robustness of the system. Jorrit Herder (Minix3) presented at the poster session of EuroSys 2006 a mechanism to restart crashed device drivers and other system services (potentially transparently to the user). To achieve this level of robustness, the damage that a crashed server can do needs to be contained to a manageable amount.

Here is an example failure case that I want to see addressed: A client C makes a call to a server S. The server S needs to call a device driver D to implement the service. While S is in the reply phase of the invocation to D, the device driver crashes and is removed from the system. Eventually, the client C gives up and exits (for example on user intervention). Now, what happens to the server S?

In Mach, S moved a "send-once" capability for the reply port to D. At destruction of D's port name space, the kernel would generate a failure message and send it to the reply port. S would thus be notified of the removal of D.

In L4.X2, the calling thread in S would be blocked on a closed wait on the thread ID of the server thread in D. At the destruction of the server thread, the list of waiters queued on that thread is traversed, and the pending IPC system calls are aborted with an error (this is called "unwinding" in the source code).

In the upcoming L4 versions, and in Coyotos, destruction of the receiver of a reply capability does not trigger any action: pending RPCs are not aborted. This is because there is an extra level of indirection between the reply capability and the thread (a first-class receive buffer). In fact, the underlying mechanisms are sufficiently expressive to allow behaviours for which the above semantics are no longer meaningful: for example, there could be copies of the reply capability in different processes (the kernel does not keep reference counts), or the caller could create a new reply capability for the reply endpoint and use it in any imaginable fashion. Still, this lack of kernel support poses an appreciable challenge: if nothing replaces this functionality, the server S in the above scenario will just hang indefinitely in an RPC operation that can never complete. A resource has been leaked permanently.

Here are a couple of ideas for what could replace this functionality:

* Whatever user program destroys the failed server process D also takes care of the users of D. This solution requires significant structural overhead, and creates undesirable strong dependency structures in the system (for example, global managers).

* The program S could use timeouts in the call to D. This solution requires significant structural changes to the system design, because time now becomes an important parameter in evaluating services. One could try to argue that this is desirable anyway.

* Following Mach, special "send-once" capabilities are introduced that implement the send-once semantics (the Mach pattern this would mimic is sketched right after this list).
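For concreteness, here is a minimal sketch of the Mach pattern the third idea refers to, written against the usual <mach/mach.h> and <mach/notify.h> interfaces. The driver_port argument, the message layout, the request id and the error mapping are purely illustrative (this is not a real driver protocol); the part that matters is the send-once notification arriving on the reply port with msgh_id == MACH_NOTIFY_SEND_ONCE:

#include <mach/mach.h>
#include <mach/notify.h>

/* S calls D and blocks for the reply on a freshly allocated reply port,
   handing D a send-once right to that port.  If D's port name space is
   destroyed before it replies, the kernel destroys the orphaned
   send-once right and delivers a send-once notification to the reply
   port instead, so the receive below still terminates.  */
kern_return_t
call_driver (mach_port_t driver_port)   /* illustrative capability to D */
{
  mach_port_t reply_port;
  kern_return_t kr;

  kr = mach_port_allocate (mach_task_self (), MACH_PORT_RIGHT_RECEIVE,
                           &reply_port);
  if (kr != KERN_SUCCESS)
    return kr;

  union
  {
    mach_msg_header_t head;
    char space[128];       /* room for D's reply or the notification */
  } msg;

  msg.head.msgh_bits
    = MACH_MSGH_BITS (MACH_MSG_TYPE_COPY_SEND,        /* request to D */
                      MACH_MSG_TYPE_MAKE_SEND_ONCE);  /* send-once reply right */
  msg.head.msgh_size = sizeof msg.head;
  msg.head.msgh_remote_port = driver_port;
  msg.head.msgh_local_port = reply_port;
  msg.head.msgh_id = 1000;                            /* illustrative request id */

  /* Combined send and receive: block until D replies, or until the
     kernel turns the dead send-once right into a notification.  */
  kr = mach_msg (&msg.head, MACH_SEND_MSG | MACH_RCV_MSG,
                 sizeof msg.head, sizeof msg, reply_port,
                 MACH_MSG_TIMEOUT_NONE, MACH_PORT_NULL);

  if (kr == KERN_SUCCESS && msg.head.msgh_id == MACH_NOTIFY_SEND_ONCE)
    /* D died before replying; unblock and report an error.  */
    kr = MACH_SEND_INVALID_DEST;        /* illustrative error mapping */

  mach_port_destroy (mach_task_self (), reply_port);
  return kr;
}

The useful property is that the kernel guarantees exactly one message per send-once right: either D's reply, or, if the right dies together with D's port name space, the send-once notification. Either way, the receive in S terminates.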
Here are the semantics expressed in terms of Coyotos: When copied, the source capability is invalidated (so the number of send-once capabilities to a given object is a system invariant under capability copy operations). If a send-once capability is dropped, the kernel generates a message to any enqueued first-class receive buffer. At task destruction, the space bank can scan the capability pages of the destroyed task and drop all (send-once) capabilities. This has the disadvantage of making task destruction somewhat more expensive, but the cost of the cleanup is at least bounded by the number of capabilities the process can allocate, and the destruction of all capabilities does not need to be atomic.

I sort of have my eyes on the last solution. Jonathan, I remember that you did not like the send-once semantics, because (IIRC) it restricts the possible server designs. For example, a server cannot keep several reply capabilities to the same caller in different worker processes. So if the server wants to reply to a message, it needs to make sure that the "send-once" reply capability ends up in the right worker process. However, in the use cases I can think of, there will be some negotiation among the worker processes about who responds to the message anyway, so I cannot really convince myself that this is a serious restriction. Maybe this is not the only reason you were against it.

So, here are a couple of questions:

1) Is RPC robustness desirable/required, or is an alternative model feasible in which machine-local RPC is as unreliable as IP/UDP network communication?

2) If it is indeed desirable, are there more possible solutions than the three approaches described above?

3) Are the costs of destroying send-once rights (and thus sending messages) acceptable? Given a positive answer to 1 and a negative answer to 2, are these costs in fact unavoidable?

4) If we consider persistence, can the same mechanism described above to cope with malicious or buggy software not also be used to deal with the planned and desired removal of device driver servers from the system at reboot of the persistent machine? IOW: As far as I understand, EROS had logic to restart pending RPCs that were sent across the boundary between the persistent and the non-persistent world. The above solution may provide a convenient and consistent approach to recover not only from the accidental loss of a single driver, but also from a planned mass exodus such as a reboot.

Thanks,
Marcus
