Hi.

I am Ethan, reaching out to express interest in this upcoming GSoC. I
came across a very interesting project, refactoring asynchronous I/O
<https://wiki.netbsd.org/projects/project/aio/>, which looks to require
quite a lot of skill and vision. Below I go through most of the
questions in your project application guideline
<https://wiki.netbsd.org/projects/application/>.

About the project?

We will be refactoring the low-level I/O pipeline within the kernel to make
all requests asynchronous by default. That is, instead of blocking
immediately until a resource becomes available, the caller supplies a
callback that is invoked later.

The first step is to develop infrastructure for a new internal API that
facilitates these asynchronous callbacks. This new API will interact
with all kinds of internal synchronization primitives. The following is
roughly what I have in mind.

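struct aio_ops {
  /* Completion callback and private data. */
  int (*callback)(struct aio_ops *);
  void *private;

  /* Request parameters. */
  int fd;
  void *buffer;
  size_t length;
  off_t offset;
  int ops;

  /* Track callback delivery and resource availability. */
  kmutex_t mtx;
  kcondvar_t delivered;
  kcondvar_t available;

  int id;
  int status;
  int error;
  ...

  TAILQ_ENTRY(aio_ops) entries;
};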

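TAILQ_HEAD(aio_ops_queue, aio_ops);
struct aio_service_pool {
  /* Operations waiting to be serviced. */
  struct aio_ops_queue ops_queue;

  /* Handlers for the class of operations this pool services. */
  int (*write)(struct aio_ops *);
  int (*read)(struct aio_ops *);
  int (*sync)(struct aio_ops *);
  int (*cancel)(struct aio_ops *);
  ...

  /* Servicing threads sleep on `pending` until work is queued. */
  kmutex_t mtx;
  kcondvar_t pending;
};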

Essentially, each asynchronous operation is described by a *struct aio_ops*.
Each operation includes a callback function along with synchronization
primitives to track the delivery of the callback as well as the
availability of the resource. Asynchronous operations are queued to a
designated service pool, and each service pool handles a specific class of
operations. These pools are backed by per-CPU kernel threads, which
remain dormant until operations are pending. Down the road, it would
probably be a good idea to load-balance pending operations across
multiple service pools of the same class.
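
As a rough illustration of the queueing described above, here is a minimal
user-space sketch in C. It assumes a heavily simplified struct aio_ops, leaves
out the mutexes and condition variables entirely, and uses hypothetical helper
names (pool_init, pool_submit, pool_service, demo_read, demo_done) that are
not part of any existing NetBSD API.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/queue.h>

/* Simplified stand-in for the proposed struct; locking is omitted. */
struct aio_ops {
  int (*callback)(struct aio_ops *);
  int id;
  int status;
  TAILQ_ENTRY(aio_ops) entries;
};

TAILQ_HEAD(aio_ops_queue, aio_ops);

struct aio_service_pool {
  struct aio_ops_queue ops_queue;
  int (*read)(struct aio_ops *); /* the one class of operation handled here */
};

static void pool_init(struct aio_service_pool *pool,
                      int (*read)(struct aio_ops *))
{
  TAILQ_INIT(&pool->ops_queue);
  pool->read = read;
}

/* Queue an operation; a real pool would also signal `pending` here. */
static void pool_submit(struct aio_service_pool *pool, struct aio_ops *op)
{
  assert(op != NULL);
  TAILQ_INSERT_TAIL(&pool->ops_queue, op, entries);
}

/* Drain the queue in FIFO order; returns the number of operations serviced. */
static int pool_service(struct aio_service_pool *pool)
{
  struct aio_ops *op;
  int serviced = 0;

  while ((op = TAILQ_FIRST(&pool->ops_queue)) != NULL) {
    TAILQ_REMOVE(&pool->ops_queue, op, entries);
    op->status = pool->read(op);
    if (op->callback != NULL)
      op->callback(op);
    serviced++;
  }
  return serviced;
}

/* Hypothetical handler and callback, for demonstration only. */
static int demo_delivered;
static int demo_read(struct aio_ops *op) { return op->id * 10; }
static int demo_done(struct aio_ops *op) { (void)op; demo_delivered++; return 0; }
```

In the kernel, pool_service would be the body of a worker thread that sleeps
on the pool's condition variable while the queue is empty, rather than a
function called directly.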

Implementation has to start from the lowest abstraction layer of the I/O
path to ensure asynchronicity throughout the stack. The challenge is
allowing these service pools to drive multiple operations concurrently
within a single thread without blocking on any individual operation. So we
will likely have to begin work at the level of the block device, or at
least where the most fundamental I/O primitives are defined, before moving
upwards.
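
One way to picture a single servicing thread making progress on several
operations without blocking on any of them is to give each operation a
non-blocking step that either completes or reports that its resource is not
ready yet. The sketch below is a user-space illustration only: struct step_op,
step_try, service_round, and the ticks_until_ready field are all hypothetical,
and resource availability is simulated with a counter.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/queue.h>

struct step_op {
  int ticks_until_ready; /* simulated resource availability */
  int done;
  TAILQ_ENTRY(step_op) entries;
};

TAILQ_HEAD(step_queue, step_op);

/* Try to make progress without blocking: 0 on completion, EAGAIN otherwise. */
static int step_try(struct step_op *op)
{
  assert(op != NULL);
  if (op->ticks_until_ready > 0) {
    op->ticks_until_ready--;
    return EAGAIN;
  }
  op->done = 1;
  return 0;
}

/*
 * One pass over the queue. Completed operations are dropped from the queue,
 * the rest are requeued in their original order. Returns the number completed.
 */
static int service_round(struct step_queue *q)
{
  struct step_queue retry;
  struct step_op *op;
  int completed = 0;

  TAILQ_INIT(&retry);
  while ((op = TAILQ_FIRST(q)) != NULL) {
    TAILQ_REMOVE(q, op, entries);
    if (step_try(op) == 0)
      completed++;
    else
      TAILQ_INSERT_TAIL(&retry, op, entries);
  }
  /* Move the operations that were not ready back for the next round. */
  while ((op = TAILQ_FIRST(&retry)) != NULL) {
    TAILQ_REMOVE(&retry, op, entries);
    TAILQ_INSERT_TAIL(q, op, entries);
  }
  return completed;
}
```

A slow operation at the head of the queue does not hold up a ready one behind
it. A real implementation would presumably sleep until a resource signals
readiness instead of polling in rounds.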

The idea is that this protocol functions as an intermediary layer within
the I/O path that supports both asynchronous and synchronous modes. For
synchronous behaviour, the callback is set to NULL and the calling
thread itself acts as the servicing thread, blocking directly on
the availability of the resource. This should avoid most of the
overhead associated with this interface, basically providing a zero-cost
abstraction for synchronous operations within a broader asynchronous
framework.
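
The synchronous shortcut can be sketched as a single submit routine that
checks the callback. This is a user-space illustration under simplifying
assumptions: aio_submit, demo_write, and demo_done are hypothetical names, the
structs are cut down to the fields needed here, and all locking is omitted.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/queue.h>

struct aio_ops {
  int (*callback)(struct aio_ops *); /* NULL selects the synchronous path */
  size_t length;
  int status;
  TAILQ_ENTRY(aio_ops) entries;
};

TAILQ_HEAD(aio_ops_queue, aio_ops);

struct aio_service_pool {
  struct aio_ops_queue ops_queue;
  int (*write)(struct aio_ops *);
};

/* Returns 1 if the operation completed inline, 0 if it was queued. */
static int aio_submit(struct aio_service_pool *pool, struct aio_ops *op)
{
  assert(pool != NULL && op != NULL);
  if (op->callback == NULL) {
    /* Synchronous: the calling thread services the operation itself. */
    op->status = pool->write(op);
    return 1;
  }
  /* Asynchronous: hand the operation over to the pool's servicing threads. */
  TAILQ_INSERT_TAIL(&pool->ops_queue, op, entries);
  return 0;
}

/* Hypothetical handler and callback, for demonstration only. */
static int demo_write(struct aio_ops *op) { return (int)op->length; }
static int demo_done(struct aio_ops *op) { (void)op; return 0; }
```

With a NULL callback the operation never touches the queue, so the
synchronous path costs one branch plus the handler call in this sketch.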

This design is still in the early stages, is very tentative, and is
definitely open to refinement. The crux of this project will be getting
this protocol right; everything else should be relatively
straightforward. So the first priority is to design and implement a proper
internal protocol for asynchronous I/O. The next step is to integrate this
protocol at the lowest abstraction layer of the I/O path, moving upwards
from there. The final step is to revise the user-exposed POSIX AIO
interface.

About me?

I am going into my third year of Computer Science at the University of
Alberta. Over the past few years, I have spent a good amount of time
working on hobbyist projects. Recently, I set up a minimal build of NetBSD
and created an efficient workflow with the help of some scripts. Yesterday
I submitted a patch for PR 58922, just something simple to get into
the groove of contributing. While I am not yet familiar with every
internal structure within NetBSD, I have spent a lot of time working
with other monolithic POSIX kernels, and I have found that the knowledge is
quite transferable, so I expect to pick things up quickly. The later
stages of this project will require knowledge of the POSIX AIO interface.
But really the entire project will require in-depth knowledge of POSIX, as
revising the I/O pipeline will involve working with many core subsystems. I
am quite comfortable with POSIX, having spent a lot of time working with
and implementing POSIX interfaces.

One of my projects, pastoral <https://github.com/ethan4984/pastoral>,
implements quite a large subset of POSIX standards, including signals, Unix
domain sockets, and a Unix file system, with support for over a hundred
POSIX calls. This compliance allowed me to cross-compile and port quite a
lot of software, such as Python, Xorg, Bash, GCC, the GNU coreutils, and
more.

More recently, I have been experimenting with microkernels through dufay
<https://github.com/ethan4984/dufay>. This project is still under
development, but most of the core interfaces for IPC and RPC are
implemented. It is a compatibility-based microkernel with a per-processor
user space scheduler that implements a CFS. We also have a pretty
interesting protocol that allows for the very efficient handling of IRQs,
exploiting tricks of shared memory, not requiring any additional routing or
context switching, unlike other microkernels.

When it comes to these projects, I usually work on them with a core group
of individuals, which has been a rewarding experience.

I am excited about this project and expect to learn a lot this
summer working on a product of this calibre with real professionals. If
anyone wishes to contact me, this email address is perfect. I am planning
on formally submitting a proposal for GSoC soon, so I would welcome any
feedback or comments.

Thanks.
