Re: A niche for the Hurd - next step: reality check

Olaf Buddenhagen Wed, 29 Oct 2008 07:26:35 -0700

Hi,

On Tue, Oct 28, 2008 at 09:03:37AM +0100, Arne Babenhauserheide wrote:


> As the Brainstorm Phase is finished, it's time for a Reality Check: 

> #### Give back power to users:

While this was indeed the main idea behind the Hurd design, it is rather
vague for the most part. We have the architecture which potentially
gives users more power, but we have very few actual use cases for
that...

One problem is that many things are possible in principle, but can't
actually be used without certain tools to expose the functionality.

Other things are possible with the existing tools, but almost no one
knows that the possibilities exist, or how to actually use them...

Hurd development over the past years has focused exclusively on trying
to catch up with other systems -- nothing has been done on features that
can actually showcase the *advantages* of the Hurd...

It's not sufficient to talk about possibilities the architecture is
supposed to offer. We need actual solutions for actual use cases -- with
tools, examples, documentation.

Coming up with use cases is tricky though. You can't really make them up
from nothing. It's somewhat of a chicken-and-egg problem: We need more
people with actual requirements that call for actual solutions. But we
won't get more people without presenting some nice use cases to get them
interested...

zhengda's GSoC project was very nice in this regard: He had a specific
use case; we discussed what would be necessary for that; and he
implemented the needed components. More of that is needed.

> arbitrary mounts,

We have to be more specific here -- there are several aspects to this.

For one, there is the ability to mount a filesystem when you have access
rights to the device, on a node you own, without an fstab entry. This is
nice, but I never really understood why this is not possible on Linux.
In fact, someone told us on IRC that it is possible on FreeBSD...

For standard filesystems, this is not very interesting anyway: Usually
you do not have access rights to the devices. (At least on a
standard Debian installation... Maybe it is different with distributions
that give logged in users access to various devices automatically.)

It gets more interesting for virtual filesystems, as here indeed the
user usually has all necessary permissions. IIRC fusermount on Linux
allows mounting without fstab entries as well though, and the
filesystems interesting here are usually implemented with FUSE -- so no
distinctive feature here. Some FUSE modules (like sshfs) even come with
special scripts to ease this process.

Of course it's nice that we do not need different methods for mounting
standard filesystems and virtual filesystems, but instead can do all
with the same settrans command (or mount wrapper). This is not likely to
cause raptures, though...

A more serious difference is the ability to mount image files directly
-- on Linux, you need loop devices for that, which is not very elegant,
and more importantly can only be done by root.
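To make the difference concrete, here is a rough sketch that only builds the two command sequences and prints them; it does not run anything. The image name, the node names and the loop device are made up for illustration, and /hurd/ext2fs is assumed to be the translator matching the image.

```python
# Sketch only: builds the command lines, does not execute them.
# "disk.img", "mnt", "/mnt" and "/dev/loop0" are hypothetical names.

def hurd_mount_image(image, node, translator="/hurd/ext2fs"):
    """One command on the Hurd: attach a filesystem translator to a node
    the user owns, backed directly by the image file."""
    # -c creates the node if it is missing, -a sets an active translator.
    return [["settrans", "-c", "-a", node, translator, image]]


def linux_mount_image(image, mountpoint, loopdev="/dev/loop0"):
    """Classic Linux route: bind the image to a loop device, then mount
    that device. Both steps normally require root."""
    return [
        ["losetup", loopdev, image],
        ["mount", loopdev, mountpoint],
    ]


if __name__ == "__main__":
    steps = hurd_mount_image("disk.img", "mnt")
    steps += linux_mount_image("disk.img", "/mnt")
    for cmd in steps:
        print(" ".join(cmd))
```

The point is less the number of commands than who is allowed to run them: the settrans line needs nothing beyond read access to the image and ownership of the node.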

Perhaps the biggest difference is that on Linux, even with FUSE, users
are limited to a fixed set of trusted filesystems provided by root. On
the Hurd, a user can mount *any* filesystem, no matter where he got the
translator from.

> subhurds

Subhurds are nice, but in the current implementation do not really give
power to users, as they can be started only by root...

I'm not sure how much work it requires to fix this, but I guess it could
be done in a few months; perhaps weeks.

In any case, subhurds also fit into the lightweight virtualization
category...

> Simpler virtual computing environments - no need to setup XEN,
> everyone can just open up his/her computer for someone else by
> creating a new user account, and the other one can login and easily
> adapt the system for his/her own needs. 

We get the same problem here as above: It is possible in principle, but
we do not really have the tools to create such custom environments, nor
documentation and examples.

This of course also fits under lightweight virtualization.

> If most systems just differ by the translators setup on them, people
> could even transfer their whole environment from one computer to
> another one without needing root access or more root interaction than
> creating a new user account. "I want my tools" -> "no problem, just
> setup your translators". 

This is an interesting option -- I actually thought in a similar
direction already.

A precondition is that the same translators are available. This could be
inconvenient, but shouldn't be a serious problem, as the user can always
compile them himself.

We also need to be relatively independent of the host system to make
sure we really get the environment we want. This requires attaching at a
rather low level... Certainly possible, but again we need tools and
documentation.

> Also it would be possible to just open an account for stuff like
> joining the "World Community Grid" allowing for easier sharing of CPU
> time. 

Not sure what you mean here...

> #### Nice features
> 
> Another example of features which would be easily possible with the
> Hurd: 
> 
> * media-player translator: - settrans play /hurd/mediaplayer_play - cp
> song1.ogg song2.ogg play - # -> files get buffered and played.
> 
> or even: 
> 
> * cp ftp://foo/bar/ogg play

Well, regarding the specific examples, such things aren't very
interesting anymore: With FUSE, this can all be done on Linux as well
nowadays... It's much harder to find examples for unique possibilities
now.

Note that translators are actually much more powerful than FUSE: While
the Hurd filesystem is completely decentralized, with FUSE all
filesystem access still passes through the kernel's VFS layer. This
means that FUSE can only be used with trusted modules, as already
pointed out above. It also means that FUSE is limited to standard
filesystem semantics, while Hurd translators can implement whatever they
want.

It is possible to change the behaviour in any aspect, including the way
file name lookup works. Admittedly the only specific use case I know is
the possibility to implement namespace-based translator selection with a
set of normal translators, without any changes to the Hurd itself.

It is also possible to extend the filesystem interfaces, adding new RPCs
and options as needed. This allows using the filesystem for
communication, yet implementing domain-specific interfaces where
standard filesystems are too inefficient or cumbersome. A sound server
would be one possible use case. (Again, I can't really think of many...)

> that's KDEs fabled network transparency on the shell level. 

Well, when designing gvfs/gio (which is the successor of gnomevfs, i.e.
the counterpart of KDE's network transparency layer), it was
contemplated to allow using all the modules not only with the native
GNOME interfaces, but also by legacy applications through FUSE. (Not
sure whether it was implemented in the end, but at least it would be
possible.)

OTOH, it was also contemplated in gvfs/gio (again, I don't know about
the final status) to leave POSIX behind, and implement a completely new
filesystem API, which for example would allow reading/writing a whole
file in a single atomic operation. FUSE can't map such new interfaces,
so Hurd translators would be in a better position here indeed.

Now let's get away from the specific example, and look at "nice
features" in general. This is indeed related to the "empowering users"
bit above -- in fact, most of the things that fit there would just as
well fit in here. And the problems are the same as well: The
possibilities exist in principle, but no tools to make use of them, no
documentation, no examples.

About the only really unique feature I'm personally aware of besides
the ones already mentioned, is the ability to give a user's processes
new group permissions in a running session. I never used it so far
however, and I wouldn't know how to use it when I need it...

So again, we need specific, well documented use cases. A lot of them.
None of these nice features are so breathtaking that people would flock
to the Hurd because of one or two examples. We need an extensive
collection of nice solutions, if we want to convince people that this is
something they will profit from in everyday use.

One thing I believe could stand out and prove a "killer feature" on its
own, is the namespace-based translator selection stuff. However, this is
not there yet -- the way it looks now, it will need a while still to be
ready for general use.

And when the development is done, the really hard part will begin:
Pushing it to the "masses". The translator selection stuff is not useful
as long as it only lies around somewhere, and waits to be invoked. It's
only really convenient if it runs all the time on the home directory, so
the magic file names can be used at any time. It should be invoked the
moment the user logs in.
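As a purely illustrative sketch of what "magic file names" could mean: a proxy sitting on the home directory would split each looked-up name into the real node and a chain of translators to stack on it. The ",," separator, the translator names and the /hurd prefix are assumptions made for this example, not a description of the finished mechanism.

```python
# Sketch only: name parsing for namespace-based translator selection.
# The ",," separator and the translator names are assumptions.

SEPARATOR = ",,"


def split_magic_name(name, sep=SEPARATOR):
    """Split 'file,,a,,b' into ('file', ['a', 'b']).

    A name without the separator yields an empty translator chain."""
    base, *translators = name.split(sep)
    return base, [t for t in translators if t]


def translator_path(short_name, directory="/hurd"):
    """Map a short translator name to a (hypothetical) binary path."""
    return f"{directory}/{short_name}"


if __name__ == "__main__":
    base, chain = split_magic_name("archive.tar.gz,,gunzip,,tarfs")
    print(base, [translator_path(t) for t in chain])
```

A plain name passes through untouched, which is why such a proxy could run on the home directory all the time without getting in the way.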

I guess it's not too hard to come up with a mechanism to allow for that;
but it must be readily available, so users can activate it easily. Even
better if it was active by default...

I fear though that such radical things have little chance in Debian --
so perhaps we actually need a custom distribution for that :-(

> #### Advanced lightweight virtualization
> 
> There is also the whole area I called "advanced lightweight
> virtualization" (see
> http://tri-ceps.blogspot.com/2007/10/advanced-lightweight-virtualization.html),
> i.e. the ability to create various kinds of interesting
> subenvironments. Many use cases are covered by much bigger fish; but
> the flexibility we offer here could still be interesting: I think the
> middle grounds we cover between directly running applications, and
> full isolation through containers or VMs, are quite unique. This could
> simplify management of demanding applications for example, by
> partially isolating them from other applications and the main system,
> and thus reducing incompatibilities. Creating lightweight software
> appliances sounds like an interesting option.

Another instance of the same problem. The possibilities are there; we
need to make use of them...

> - All-in-one out-of-the-box distro running a webserver for crash-proof
> operation. 

I don't really see any advantages the Hurd would offer for ordinary
software appliances compared to other systems. The whole point of the
Hurd is giving more power to the users, and software appliances are
precisely about needing as little intervention as possible...

However, I dwelt on an interesting variation above: For *virtual*
appliances the Hurd might be interesting -- using hurdish
subenvironments instead of full virtualisation, they could be more
efficient and convenient.

> #### Easier access to low-level functions
> 
> One important use is for very technical people, who don't always go
> with standard solutions, but rather use new approaches to best solve
> their problems, and will often find traditional kernels too limiting.

As you made a distinction between target groups and use cases, wouldn't
this better fit in the first category?...

Anyways, same problem again: Examples, documentation...

> Another interesting aspect is application development: With the easily
> customized/extended system functionality, and the ability to contain
> such customizations in subenvironments, I believe that Hurd offers a
> good platform for much more efficient development of complex
> applications. Application developers can just introduce the desired
> mechanisms on a very low level, instead of building around existing
> abstractions. The extensible filesystem in particular seems extremely
> helpful as a powerful, intuitive and transparent communication
> mechanism, which allows creating truly modular applications.

> #### The possibility to create more efficient and powerful desktop
> environments

These really go together.

And again the same problem. It doesn't help to claim that our
architecture helps software developers. The hurdish interfaces are
hardly documented, and there are absolutely no examples of how they can
be used by application software.

I mentioned the desktop thing as one possible showcase. The problem here
is that it is a task for many years... Would be nice to have some
hurdish applications for showing around, in a closer timeframe.

> - operating system study purposes as its done with minix

I wonder what would be necessary to make the Hurd more interesting in
this aspect. Probably mostly boils down to various kinds of
documentation as well...

> - Having a _complete_ GNU System

Indeed, this is an important aspect. The Hurd would be long dead if it
didn't enjoy constant popularity by being the official GNU kernel...

Yet I doubt that this alone is a sufficient niche for the Hurd to
prosper.

It could be useful in other ways though: I mentioned above that a custom
distribution might be necessary to showcase the possibilities the Hurd
offers. One interesting thing planned for the GNU system is making use
of the extendable filesystem, to implement an entirely filesystem-based
package manager. (Not requiring any explicit package database.)

The GNU system suffers from severe neglect however: For many years,
nobody was much interested in working on releasing a complete GNU
system. AMS at some point started nagging, and RMS appointed him "GNU
release manager". He did actually work on it somewhat; however, with his
behaviour he pissed off any potential contributors. So while he takes the
credit for reviving the idea of actually releasing a GNU system, he at
the same time prevented it from actually taking off... And after he was
banned from the main Hurd list and channel, he lost interest
altogether, and the GNU system is stalled again -- only worse, as he is
still the official release manager, so anyone who tries to pick it up
again will have to deal with him...

There are actually two other possible niches I meant to mention, but
somehow seem to have forgotten:

- Safely running dangerous applications

After the demise of Hurd/L4, some of the developers became interested in
high security systems, and specifically Coyotos. Coyotos is a system
that consistently implements the Principle Of Least Authority, by
splitting up the system and all applications into very small components,
and confining them as well as possible. Much oversimplified, it means
that the web browser for example only gets access to certain network
ports and certain screen windows, but no access to the user's files,
except on explicit request. Also the browser consists of many individual
components with even less authority: The network loader has access only
to the network, the layout engine has access to no external resources at
all, and only the user interface can ever access files. This way, even
if one of the components gets compromised, it can't do much damage, as
the component having network access doesn't have access to the user's
files, and the component having access to the user's files doesn't have
network access.
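The browser example can be modelled in a few lines. This is a toy model, not how Coyotos actually represents authority: each component simply holds a set of capability names, and the "screen" capability given to the user interface is an assumption added for the example.

```python
# Toy model of the split browser: each component holds only the
# authority it needs. Capability names are illustrative assumptions.
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Component:
    name: str
    authority: frozenset = field(default_factory=frozenset)

    def can(self, capability):
        return capability in self.authority


SPLIT_BROWSER = [
    Component("network loader", frozenset({"network"})),
    Component("layout engine", frozenset()),
    Component("user interface", frozenset({"files", "screen"})),
]

MONOLITHIC_BROWSER = [
    Component("browser", frozenset({"network", "files", "screen"})),
]


def can_leak_files(components):
    """Names of components that, if compromised on their own, could read
    the user's files *and* send them out over the network."""
    return [c.name for c in components if c.can("files") and c.can("network")]


if __name__ == "__main__":
    print("split:", can_leak_files(SPLIT_BROWSER))
    print("monolithic:", can_leak_files(MONOLITHIC_BROWSER))
```

In the split variant no single component holds both capabilities, so the list of dangerous components is empty; in the monolithic variant the whole browser is on it.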

So for a while some Hurd developers were contemplating a system based
partially on Coyotos, and quite similar to Coyotos in fact. However,
while such a highly secure system might be a good niche, it would have
little to do with the Hurd really. After a while they realized that this
is not quite the right direction, and gave up on the idea.

All the time I was convinced that if not going to such extremes as
Coyotos -- implementing POLA throughout the system -- but instead only
trying to confine some of the most dangerous applications, this could be
implemented quite well on the existing Hurd.

A framework for confining individual applications is really just one
possible use case of the hurdish subenvironments. Writing the tools
necessary for that should be quite doable in a few months. It's probably
not really much coding -- most of the work would be figuring out how it
should be set up exactly.

Splitting up dangerous applications into smaller components would also
be possible, but much more work.

- Effective resource management

The current Hurd implementation suffers from the lack of proper resource
management. There are two (related) problems: One is that the kernel has
no idea how processes are using memory and other resources, so it can't
decide how to best distribute them. The other is that processes often
invoke other processes (servers) to do some work for them, and Mach has
no means to attribute this usage to the invoking task (resource
accounting). Both result in very poor resource utilisation: It's not
possible to drain caches under memory pressure, for example. The latter
also results in denial of service being really easy.

While DoS could probably be prevented with various static limits, it
would make the system considerably less flexible, and also result in
even poorer resource utilization.
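The accounting problem is easy to show with a toy ledger. This is only a model of the bookkeeping, not of Mach or of any real interface: the server and client names and the page counts are made up.

```python
# Toy ledger: who gets charged when a server allocates memory on behalf
# of a client? Names and numbers are made up for illustration.
from collections import Counter


class Ledger:
    def __init__(self, attribute_to_client):
        # False models the current situation: usage sticks to the server.
        self.attribute_to_client = attribute_to_client
        self.usage = Counter()

    def serve(self, server, client, pages):
        """Record `pages` of memory the server uses to handle a request."""
        payer = client if self.attribute_to_client else server
        self.usage[payer] += pages


def simulate(attribute_to_client):
    ledger = Ledger(attribute_to_client)
    ledger.serve("ext2fs", "well-behaved", 10)
    ledger.serve("ext2fs", "greedy", 10_000)
    return dict(ledger.usage)


if __name__ == "__main__":
    print("without accounting:", simulate(False))
    print("with accounting:   ", simulate(True))
```

Without attribution all that shows up is a fat server, and there is no way to tell which client to throttle; with attribution the greedy client is visible and can be limited individually, without a static limit that would hurt everyone.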

Note that both of these problems are also present in monolithic systems,
but considerably less manifest: The kernel itself does do much more
(filesystems, network stack etc.), so it has a much better picture of a
good part of the resource use in the system -- kernel caches can easily
be drained on memory pressure.

And invoking server processes is considerably less common in monolithic
systems: Processes mostly only use resources themselves, or invoke the
kernel -- in both cases, the kernel knows how to account the resources.

The lack of mechanisms in Mach for proper resource accounting, and the
lack of mechanisms that would allow applications to give hints about
their memory use, were the main motivation behind the Hurd/L4 port; and
these matters are also the subject of Neal's current research work.

Now the idea is that we could make a virtue out of necessity: Once we
have a proper resource management framework, we should be able not only
to catch up with traditional systems in this regard, but in fact
surpass them.

However, this is a *lot* of work. It either means making changes to Mach
so fundamental that it effectively becomes a different kernel; or
porting to a different kernel right away. (Viengoos being the obvious
candidate.) It is work quite orthogonal to the original goals of the
Hurd, and requiring a lot of expertise.

So this unfortunately won't be a viable niche in the foreseeable
future... Just mentioning it for completeness, as it is something that has
been considered in this light before.

-antrik-
