Hi,

One of the topics briefly discussed in the last hangout was the distinction between “nodes” and “open files”. I said I would post a more elaborate explanation of how it works, so here it is.
This text is quite long. If you are not interested in the details of a possible future VFS server, feel free to stop reading now. Also, most parts are probably uninteresting or irrelevant. You have been warned.

Firstly, let me explain “why nodes”. The current VFS server internally keeps its state as a “VFS triplet”, which is basically an exact address of the file in the currently running VFS (server + file system driver). Outside the server, clients refer to files using their filenames. Simple enough.

In my approach, the namespace is a process-local view composed from the subtrees provided by the filesystems. This has the side effect of filenames being process-local as well. The same name does not necessarily refer to the same file in every process (and in fact, this is a very good thing for filesystem conventions - see how Plan9 utilises it). Thus, we need a mechanism to refer to a file directly.

One way would be to use the VFS triplet itself. This has two serious problems. First, to enforce any kind of dynamic access rights (think untrusted binary blobs executed on a machine you’re doing bank transactions with), we must take a “capability system” approach. That is, if a process gains access to a node, it gains permission to work with it. That also means that the access information must be unguessable, so the VFS triplet is out of the question. The second problem is that such a node would have to be provided by an actual filesystem process, which restricts what we can do inside the VFS server (not a big problem in general, but some simple generic extensions can be done very easily inside the VFS server - e.g. unions, pipes, composed hierarchies - and I exploit that as much as seems reasonable to simplify the overall design).

By similar arguments, we can’t pass pointers (invalid requests would crash the server) or an index into a global table (which would be either guessable, or unnecessarily dependent on randomness). That means the most reasonable approach is to deal with nodes the same way we deal with open files - passing an index into a dynamically managed, client-local table (a rough sketch follows below).

With file-descriptor-like handling comes an analogous problem of passing nodes between processes (as has been noted before, acquiring a node becomes the only mechanism to access a file). The current VFS server uses a generic mechanism for brokering a change in externally managed state (for file descriptors), and this mechanism can be utilised exactly the same way with nodes. Jakub J. noted that if the handling of references to both is so similar, we can just as well unify them in some ways, so we don’t have two distinct mechanisms for essentially the same thing. Well, after thinking about it for some time, I must say that we don’t even need the VFS server to deal with two different types of objects, but more about that later.

In the rest of this mail, I will explain the basic structure (in broad strokes): first, how it looked before Thursday, then how I intend it to look now. I attached scans of some pictures I drew in an attempt to make it more understandable (I wanted to draw them on the computer, but I found it quicker to simply draw them by hand). Solid arrows represent control flow. When an arrow enters a filled dot next to a box representing data, it means the control flow accesses or modifies that data. Dashed lines mean that the data or return value represents the object pointed to. Lines that enter the picture from the left side are IPC calls. All in all, the pictures are pretty confusing, so don’t count on them to explain anything.

The original structure: see [scan #1].
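Before going into the structure itself, here is a rough sketch of what I mean by a client-local handle table. The names and the memory management are made up for illustration; this is not the actual VFS server code.

/* A per-client table of node references.  The integer handle handed out to
 * the client is just an index into this table, so it means nothing to any
 * other client and carries no guessable global information. */

#include <stdlib.h>

typedef struct node node_t;     /* opaque here; lives elsewhere in the server */

typedef struct {
	node_t **slots;         /* slot index == handle given to the client */
	size_t capacity;
} handle_table_t;

/* Store a node reference and return a handle for it, or -1 on failure. */
static int handle_put(handle_table_t *tbl, node_t *node)
{
	for (size_t i = 0; i < tbl->capacity; i++) {
		if (tbl->slots[i] == NULL) {
			tbl->slots[i] = node;
			return (int) i;
		}
	}

	/* No free slot - grow the table. */
	size_t ncap = (tbl->capacity == 0) ? 4 : 2 * tbl->capacity;
	node_t **nslots = realloc(tbl->slots, ncap * sizeof(node_t *));
	if (nslots == NULL)
		return -1;
	for (size_t i = tbl->capacity; i < ncap; i++)
		nslots[i] = NULL;

	nslots[tbl->capacity] = node;
	tbl->slots = nslots;
	int handle = (int) tbl->capacity;
	tbl->capacity = ncap;
	return handle;
}

/* Translate a handle received over IPC back to a node.  An invalid or stale
 * handle simply yields NULL, so a bogus request cannot crash the server. */
static node_t *handle_get(handle_table_t *tbl, int handle)
{
	if (handle < 0 || (size_t) handle >= tbl->capacity)
		return NULL;
	return tbl->slots[handle];
}

The point is that a handle only makes sense relative to the table of the client that received it, which is exactly the capability-like property argued for above.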
There are three IPC calls that directly work with the client’s namespace. IPC calls that work on nodes are the management methods - link, unlink, looking up a descendant, reference counting (the node returned by a lookup need not be newly created, hence the reference management), and opening a node to get a file descriptor. IPC calls that work on file descriptors are all the methods that access a file’s contents. Although these are internally forwarded to the node in some form, there is no direct IPC call that accesses the contents through a node.

The diagram shows the node as representing a physical file. That need not be the case, but I didn’t include that variety in this picture. Also, the right-hand part of the picture was my attempt to explain how bindings (my replacement for mount points) work. I ran out of space on that one, so it’s not really helpful, sorry.

Originally, permissions on a node were part of the node’s state, and the read/write/etc. calls checked them. There was supposed to be another call that returns an instance of the node with restricted permissions. That didn’t work all that well, so I made it simpler by adding a new kind of node that wraps the node we need to restrict. [scan #2] shows several implementations of the Node interface that serve different purposes. The two at the bottom are explained further down.

Next idea: No file descriptors, just nodes. When you look at the first picture, you notice that the file descriptor only works as a proxy for a node, accompanied by a set of permissions (or rather the mode of operation, for sanity checks) and a position. A natural thought is to get rid of it, expose WRITE_AT and READ_AT calls to the client, and implement the conventional API within libc. This is *almost* good enough. There are some problems, though. Most importantly, for some kinds of nodes, you need to know whether the client “intends” to access the contents at all, and whether it intends to read or to write. For example, pipes need this to decide whether they should block or return EOF.

Next next idea: A special kind of node to represent the client’s “intent” to work with the contents. This was a slightly weird idea. The basic approach was that there would be another “wrapper” node implementation, OpenNode. When accepting a read/write call, the VFS server would first check the type of the node in question. If it was not an OpenNode, the operation would fail; if it was, the operation would be forwarded to it normally. All nodes would support an OPEN operation that returns an OpenNode instance wrapping the receiver. In all other respects, it would be a fairly standard node. [scan #3] depicts the situation, and also suggests a better solution.

Final idea: Just nodes as before, but no violation of the interface abstraction through type checking. Instead, there are both OPEN and CLOSE operations on a node, but instead of returning anything, they manage a client-local counter of access “intentions”. As long as there have been calls to OPEN not yet matched by CLOSE, the node can be used for I/O. This keeps the simplicity of only dealing with nodes in the VFS server, allows the stream metaphor to be implemented purely in libc, and still keeps track of which clients read and write the contents. (Rough sketches of both sides follow below.)

So there you have it: a single, conceptually simple kind of object exposed by the VFS server, leaving libc to implement the UNIX API if we want to (though I’m utterly convinced we can do better). Any thoughts?
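To make the final idea a bit more concrete, here is a rough sketch of the server side. All names and types are invented for illustration; this is not meant as the actual implementation. A slot of the client-local handle table sketched above would carry the node plus the counter of unmatched OPEN calls.

#include <errno.h>
#include <stddef.h>
#include <sys/types.h>

typedef struct node node_t;

/* Operations implemented by every node type (regular file, pipe, union, ...). */
typedef struct {
	ssize_t (*read_at)(node_t *, void *buf, size_t size, off_t pos);
	ssize_t (*write_at)(node_t *, const void *buf, size_t size, off_t pos);
} node_ops_t;

struct node {
	const node_ops_t *ops;
};

/* What a client-local handle slot could carry: the node plus the number of
 * OPEN calls not yet matched by CLOSE for this particular client. */
typedef struct {
	node_t *node;
	unsigned open_count;
} node_ref_t;

/* OPEN and CLOSE do not return a new object; they only maintain the counter. */
static void vfs_node_open(node_ref_t *ref)
{
	ref->open_count++;
}

static void vfs_node_close(node_ref_t *ref)
{
	if (ref->open_count > 0)
		ref->open_count--;
}

/* READ_AT is allowed only while the client has expressed its intent via OPEN;
 * otherwise it fails without ever reaching the node implementation. */
static ssize_t vfs_node_read_at(node_ref_t *ref, void *buf, size_t size, off_t pos)
{
	if (ref->open_count == 0)
		return -EBADF;   /* no matching OPEN */
	return ref->node->ops->read_at(ref->node, buf, size, pos);
}

How the OPEN/CLOSE information would be propagated to the node implementation itself (so that e.g. a pipe can see its readers and writers) is left out of the sketch.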
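And the libc side: a sketch of how the conventional read() could be built purely in libc on top of a positional READ_AT call. The vfs_read_at() stub, its signature, and the stream_t type are again made up for illustration.

#include <stddef.h>
#include <sys/types.h>

/* Hypothetical IPC stub: ask the VFS server to read `size` bytes at `pos`
 * from the node referred to by `handle`. */
extern ssize_t vfs_read_at(int handle, void *buf, size_t size, off_t pos);

/* What libc keeps per stream: the node handle and the current position -
 * i.e. the state that used to live in the VFS server's file descriptor. */
typedef struct {
	int handle;
	off_t pos;
} stream_t;

/* Classic read() semantics, implemented entirely on the client side. */
ssize_t stream_read(stream_t *s, void *buf, size_t size)
{
	ssize_t nread = vfs_read_at(s->handle, buf, size, s->pos);
	if (nread > 0)
		s->pos += nread;
	return nread;
}

write() and lseek() would be analogous; lseek() in particular would become a purely local operation.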
PS: I spent a lot of time thinking over this and other things and getting my scanner to work, so all the progress I was able to make on the implementation was to get the already written code to compile. I hope next weekend will be more productive. Now I have a very clear picture of what each part should do exactly, so it shouldn’t take long until some workable prototype exists.

-- Jirka Z.

[scan #1] https://docs.google.com/open?id=0BwpNZtnCMAWIRC1FY25HYzc0M0k
[scan #2] https://docs.google.com/open?id=0BwpNZtnCMAWIUUQwZ21QbkhLV0U
[scan #3] https://docs.google.com/open?id=0BwpNZtnCMAWIc0VDbkltZS1kem8

_______________________________________________
HelenOS-devel mailing list
[email protected]
http://lists.modry.cz/cgi-bin/listinfo/helenos-devel
