On Fri, Jan 19, 2018 at 7:35 AM, Ricardo Wurmus <rek...@elephly.net> wrote:

>
> Hi Brent,
>
> > I put a screencast of the lecture on youtube:
> >
> > https://www.youtube.com/watch?v=JwsuAEF2FYE
>
> thank you.  This was very interesting.  The introduction to Mach IPC and
> memory management was especially good.  I wonder if a shorter variant of
> this part of the lecture could be used by new contributors as an
> alternative to reading the Mach kernel postscript books.
>

That's a good idea, and I can leverage the work I've already done with the
graphics.  I'll have to think about what we might want in a second video
that isn't in the first one.  Any suggestions?

> Personally, I’m very interested in a Single System Image Hurd cluster; I
> still have a bunch of unused Sun cluster nodes with x86_64 CPUs, but
> sadly there is no high-speed network to connect them all (just regular
> old 1G network cards).
>
> In your experience, is high-speed network very important or are there
> ways to make it unlikely that memory has to be transferred across nodes?
>

My experience is that we're nowhere close to 1 Gbps vs 40 Gbps being an
issue.

My primary test environment is a virtual machine that talks to itself as
the "remote", and the performance problems there are severe enough that
running remote programs results in a delay that is noticeable to the human
running the program.

Actually, we're barely even at that point.  Stock Hurd can't even execute
remote programs.  You either need to patch the exec server so that it reads
binaries and shared libraries instead of memory mapping them, or use the
multi-client libpager that I've been working on for the past six months,
which is still failing some test cases.

See, for example,
http://lists.gnu.org/archive/html/bug-hurd/2016-08/msg00099.html, or do a
reverse chronological search on bug-hurd for "libpager".
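To make the read-instead-of-map option a bit more concrete, here's a rough
sketch in plain C (illustrative only, not the actual exec server code): a
mapped binary leaves page faults to be satisfied later by the file's pager,
which across netmsg turns into remote memory-object RPCs, whereas reading
it copies the whole image up front through ordinary I/O.

/* Illustrative only -- not exec server code.  Contrast deferred paging
   (mmap) with an up-front copy into anonymous memory (read). */

#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Approach 1: map the binary.  Page faults are satisfied later by the
   file's pager, so a working (possibly remote) memory object is needed. */
static void *load_by_mapping(int fd, size_t size)
{
    void *p = mmap(NULL, size, PROT_READ, MAP_PRIVATE, fd, 0);
    return p == MAP_FAILED ? NULL : p;
}

/* Approach 2: copy the binary into anonymous memory.  Once this returns,
   no file pager is involved at all. */
static void *load_by_reading(int fd, size_t size)
{
    char *buf = malloc(size);
    size_t done = 0;
    if (buf == NULL)
        return NULL;
    while (done < size) {
        ssize_t n = read(fd, buf + done, size - done);
        if (n <= 0) {
            free(buf);
            return NULL;
        }
        done += n;
    }
    return buf;
}

int main(int argc, char **argv)
{
    struct stat st;
    int fd;

    if (argc != 2 || (fd = open(argv[1], O_RDONLY)) < 0 || fstat(fd, &st) < 0)
        return 1;

    printf("mapped at %p, copied to %p\n",
           load_by_mapping(fd, st.st_size),
           load_by_reading(fd, st.st_size));
    return 0;
}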

At the very end of that video lecture, I suggested how we might address the
performance problems in "netmsg", after we've got a multi-client libpager,
and after we've got a 64-bit user space, and after we've got SMP support.
(Who cares about a cluster where you can only use 4 GB and one core on each
node?)

First, netmsg needs to be rewritten so that it doesn't serialize all the
Mach traffic over a single TCP session.  That's probably its biggest
bottleneck.
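To give a sense of what I mean, here's a rough sketch of the idea in plain
C (this is not netmsg's actual code, and the addresses, port numbers and
names are made up): instead of funnelling every Mach message through one
socket, keep a small pool of TCP connections and spread messages across
them, so traffic for one port doesn't have to queue behind a large transfer
on another.

/* Sketch only, not netmsg code: a small pool of TCP connections instead
   of a single multiplexed, head-of-line-blocking byte stream. */

#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

#define NCONN 4

static int conns[NCONN];

/* Connect NCONN sockets to the peer; returns 0 on success, -1 on error. */
static int pool_connect(const char *addr, unsigned short port)
{
    struct sockaddr_in sa;
    memset(&sa, 0, sizeof sa);
    sa.sin_family = AF_INET;
    sa.sin_port = htons(port);
    if (inet_pton(AF_INET, addr, &sa.sin_addr) != 1)
        return -1;

    for (int i = 0; i < NCONN; i++) {
        conns[i] = socket(AF_INET, SOCK_STREAM, 0);
        if (conns[i] < 0
            || connect(conns[i], (struct sockaddr *) &sa, sizeof sa) < 0)
            return -1;
    }
    return 0;
}

/* Pick a connection based on the destination port, so messages for
   different remote ports don't serialize behind one byte stream. */
static int pool_send(unsigned long dest_port, const void *msg, size_t len)
{
    int fd = conns[dest_port % NCONN];
    return write(fd, msg, len) == (ssize_t) len ? 0 : -1;
}

int main(void)
{
    /* Hypothetical peer address and port, purely for illustration. */
    if (pool_connect("192.168.1.2", 2345) < 0)
        return 1;
    return pool_send(42, "hello", 5);
}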

After that, I think we should move our networking drivers into a shared
library, so that netmsg can access the PCI device directly.  That would
avoid a lot of context switches, like netmsg <-> TCP/IP stack <-> network
driver.

And it's not as crazy as it might first seem.  Networking devices are
increasingly virtualized.  Not only can you fiddle a few software options
to add a virtual PCI device to a virtual machine, but Cisco's vNIC cards
can present themselves as anywhere from 1 to 16 PCI devices.  So flip a few
options in your management console, and even your hardware can make a new
PCI device appear out of thin air.

So, it makes sense to allocate an entire PCI device to netmsg, and let all
your inter-node Mach traffic run over one interface, while your normal
TCP/IP stack has a separate interface.  Of course, we also need to support
configurations where you can't do that, but I think raw PCI access is going
to be the way to go if you really want performance.

Those are my current thoughts on Hurd cluster performance.  In short,
priorities are:

1. multi-client libpager (nearly usable)
2. 64-bit user space
3. SMP
4. rewrite netmsg to avoid TCP serialization (and other issues)
5. raw PCI device access

    agape
    brent
