IO Multiplexing

Ben Goldberg Fri, 12 Nov 2010 00:29:49 -0800

I would like to know, is perl6 going to have something like select
(with arguments created by fileno/vec), or something like IO::Select
(with which the user doesn't need to know about the implementation,
which happens to be done with fileno/vec/select), or only an event
loop.


I would recommend that there NOT be any sort of fileno exposed to the
user, unless he goes out of the way to get it -- any function (in
particular, posix functions) should simply take a perl filehandle, and
that function in turn would pull out the fileno (or fail
appropriately, if the filehandle doesn't have a fileno).  If users
want to know if filehandles correspond to the same underlying file,
then there could be a method -- perhaps $fh.uses_same_descriptor($fh2),
or somesuch.
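
To make the idea concrete, here is a minimal sketch in Python rather than Perl 6, since no Perl 6 API for this exists in the proposal. The function names are hypothetical. The first compares descriptor numbers; the second asks the OS whether two descriptors refer to the same underlying file. Handles that have no descriptor simply compare as "not the same".

```python
import os


def uses_same_descriptor(fh1, fh2):
    """True if both handles wrap the same OS-level descriptor number."""
    try:
        return fh1.fileno() == fh2.fileno()
    except (AttributeError, OSError, ValueError):
        # At least one handle has no descriptor (e.g. an in-memory handle).
        return False


def uses_same_file(fh1, fh2):
    """True if both handles refer to the same underlying file."""
    try:
        # os.path.sameopenfile compares the fstat() results of two descriptors.
        return os.path.sameopenfile(fh1.fileno(), fh2.fileno())
    except (AttributeError, OSError, ValueError):
        return False
```

The caller never sees a descriptor number; the functions pull it out of the handle themselves, which is the behaviour recommended above.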

If there's a select() builtin (and I'd much rather that there not be
-- it should be hidden away in a class, like perl5's IO::Select), I'd
very much hope that it would take and return Sets of filehandles, not
vec packed strings.  I'd prefer there not be one. [**]
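
As an illustration of a set-based interface, the following Python sketch wraps the standard select.select() call, which already accepts handle objects instead of packed bit strings. The wrapper name select_sets is an assumption made for this example, not a proposed Perl 6 builtin.

```python
import select
import socket


def select_sets(readers, writers, timeout=None):
    """Take sets of handles and return the subsets that are ready."""
    # select.select() accepts any objects that have a fileno() method and
    # hands the same objects back, so no descriptor numbers are exposed.
    ready_r, ready_w, _ = select.select(list(readers), list(writers), [], timeout)
    return set(ready_r), set(ready_w)
```

Calling select_sets({a, b}, {a}, timeout=1.0) on two connected sockets returns two sets that contain only the handles that can currently be read or written.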

If there's something like perl5's IO::Select, it should be able to
"just work" regardless of whether the perl filehandles are sockets,
regular files, or user-created pure-perl filehandles (which might
never block, or which might use one or more normal filehandles
internally, which in turn might potentially block).  This is what I'd
prefer.
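
A rough sketch of such a "just works" multiplexer, written in Python on top of its standard selectors module. The class name Multiplexer and its methods are hypothetical. The assumption made here is that any handle the OS cannot poll (no descriptor at all, or a regular file that epoll refuses) is treated as always ready, because reading from it will not block on the network.

```python
import io
import selectors
import socket


class Multiplexer:
    """IO::Select-like object that accepts any kind of handle."""

    def __init__(self):
        self._selector = selectors.DefaultSelector()
        self._always_ready = set()

    def add(self, handle):
        try:
            self._selector.register(handle, selectors.EVENT_READ)
        except (ValueError, PermissionError):
            # ValueError: the handle has no usable descriptor.
            # PermissionError: epoll refuses regular files.
            self._always_ready.add(handle)

    def can_read(self, timeout=None):
        if self._always_ready:
            timeout = 0  # something is ready already, so only poll
        ready = {key.fileobj for key, _ in self._selector.select(timeout)}
        return ready | self._always_ready

    def close(self):
        self._selector.close()
```

A user-created handle that wraps other, potentially blocking handles would need to expose them to the multiplexer; this sketch does not attempt that.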

Lastly, if perl6 has an efficient enough built-in event loop, and
sufficiently lightweight coroutines (or maybe I should say fibers?),
then we might not need to have any kind of explicit multiplexing.

For example, any time user code does a read operation on a handle that
isn't (from the user code's point of view) in nonblocking mode, the
filehandle implementation would tell the event loop to yield to it
when the handle becomes readable, then it would yield to the event
loop, then (once it gets back control) read from the handle.[*]
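
The sequence described above can be sketched with Python generators standing in for fibers. This is only an illustration under stated assumptions: the names yielding_read, reader and run are made up, the loop handles reads only, and it assumes no two fibers wait on the same handle at once.

```python
import selectors
import socket
from collections import deque


def yielding_read(sock, size):
    """Read that looks blocking to its caller but parks the fiber in the loop."""
    yield sock               # tell the loop: resume me when sock is readable
    return sock.recv(size)   # control is back; data is available to read


def reader(name, sock, log):
    """Example fiber: user code written as if the read simply blocked."""
    data = yield from yielding_read(sock, 1024)
    log.append((name, data))


def run(fibers):
    """Tiny event loop: run fibers until each one has finished."""
    selector = selectors.DefaultSelector()
    runnable = deque(fibers)
    parked = 0
    while runnable or parked:
        while runnable:
            fiber = runnable.popleft()
            try:
                handle = next(fiber)          # run until the fiber wants to read
            except StopIteration:
                continue                      # this fiber has finished
            selector.register(handle, selectors.EVENT_READ, data=fiber)
            parked += 1
        if parked:
            for key, _ in selector.select():  # block until some handle is ready
                selector.unregister(key.fileobj)
                parked -= 1
                runnable.append(key.data)
    selector.close()
```

The user-level code in reader() never mentions the event loop; only yielding_read() knows about it, which is the division of labour proposed here.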

This provides lots of convenience, but it would resemble Java IO
before NIO -- except with one fiber per handle instead of one
thread per handle.  Coroutines/green threads/fibers are much lighter
weight than "real" threads, but often aren't as fast as a well-written
select() loop specially written for the user's task.

Thus, I'd hope for perl6 to have an IO::Select, and
automatically-yielding [*] blocking IO, and not have a select()
builtin.  [**]

[*] This is a simplification:
A) If a user explicitly marks a filehandle as not yielding to other
coroutines, it would do a blocking read (or whatever) instead of going
through the event loop rigmarole.
B) If perl6 was compiled with an asynchronous IO library (or is on
Windows and is not using stdio and has (Read|Write)FileEx support),
then it might start the Async IO operation, tell the event loop to
wake it when the operation completes, then yield to the event loop.
C) Depending on circumstances, it *may* be more efficient to have the
event loop itself do the reading or other IO, and schedule the
fibers for which the IO was done, than to have the fibers do the IO.
TMTOWTDI.  This would be especially important if perl is compiled with
async IO -- the event loop might first wait for the fds to be readable,
etc., *then* start the async IO for those fds, then schedule the fibers
for which the performed IO has completed, thus minimizing the number
of outstanding async IO operations.

[**] The main reason I'd prefer that perl6 not have a select() builtin
is that every time it's called perl would need to convert user-level
Sets of filehandles into the underlying implementations' versions of
them (fd_sets on unixy systems, fd_sets and/or an event queue handle on
Windows), and then back to perl Set objects, and free up the
implementation version of the filehandle set... this is inefficient.

A well written IO::Select-like object could create (potentially empty)
versions of the OS's set of filehandles when it's created, add to that
set as needed, and NOT destroy that implementation-specific set until
the IO::Select object itself is destroyed.  Perl5's IO::Select does
this with the packed bitsets that it creates to pass to select.  It
could improve its efficiency by using fd_sets instead of
bitstrings, and by calling the C select(2) instead of the perl
select().

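Better still would be epoll, where each fd is registered with the
kernel once (epoll_ctl) and every later epoll_wait reuses that
registration.  In this case, avoiding repeated setup makes an object
multiplexer model enormously more efficient than something like
select().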
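
A small Python sketch of the "set up once, wait many times" pattern. It uses selectors.DefaultSelector, which picks epoll on Linux and kqueue on BSD/macOS, and falls back to poll or select elsewhere (where the saving is smaller). The helper names make_multiplexer and wait_for are assumptions made for this example.

```python
import selectors
import socket


def make_multiplexer(readers):
    """Register every reader once; the OS-level set lives as long as the object."""
    selector = selectors.DefaultSelector()
    for index, sock in enumerate(readers):
        selector.register(sock, selectors.EVENT_READ, data=index)
    return selector


def wait_for(selector, count, timeout=1.0):
    """Collect messages until `count` have arrived or a wait times out."""
    received = []
    while len(received) < count:
        events = selector.select(timeout)  # no per-call rebuild of the set
        if not events:
            break                          # timed out, give up
        for key, _ in events:
            received.append((key.data, key.fileobj.recv(1024)))
    return received
```

Every call to wait_for() reuses the registrations made in make_multiplexer(); nothing is converted to or from user-level sets between calls.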

Similarly, on Windows, if we use WSAEventSelect or WSAAsyncSelect to
create readability/writability/etc. events for IO operations we want
to wait on, and WSAWaitForMultipleEvents (or WaitForMultipleObjects)
as the blocking operation, then having an object multiplexer (which
keeps events from one call to the next) is far better than a simple
subroutine (which needs to cancel those events after it blocks and
before it returns).
