Hi,

One of the topics briefly discussed in the last hangout was the distinction between “nodes” and “open files”. I said I would post a more elaborate explanation of how it works, so here it is.
This text is quite long. If you are not interested in the details of a possible future VFS server, feel free to stop reading now. Also, most parts are probably uninteresting or irrelevant. You have been warned.

Firstly, let me explain “why nodes”. The current VFS server internally keeps its state as a “VFS triplet”, which is basically an exact address of the file in the currently running VFS (server + file system driver). Outside the server, clients refer to files using their filenames. Simple enough.

In my approach, the namespace is a process-local view composed from the subtrees provided by the filesystems. This has the side effect of filenames being process-local as well. The same name does not necessarily refer to the same file in every process (and in fact, this is a very good thing for filesystem conventions - see how Plan9 utilises it). Thus, we need a mechanism to refer to a file directly.

One way would be to use the VFS triplet itself. This has two serious problems. First, to enforce any kind of dynamic access rights (think untrusted binary blobs executed on a machine you’re doing bank transactions with), we must take a “capability system” approach. That is, if a process gains access to a node, it gains permission to work with it. That also means that the access information must be unguessable, so the VFS triplet is out of the question. The second problem is that such a node would have to be provided by an actual filesystem process, which restricts what we can do inside the VFS server (not a big problem in general, but some simple generic extensions can be done very easily inside the VFS server - e.g. unions, pipes, composed hierarchies - and I exploit that as much as seems reasonable to simplify the overall design).

By similar arguments, we can’t pass pointers (invalid requests would crash the server) or an index into a global table (which would be either guessable, or unnecessarily dependent on randomness). That means the most reasonable approach is to deal with nodes the same way we deal with open files - passing an index into a dynamically managed, client-local table (a rough sketch follows below).

With file-descriptor-like handling comes an analogous problem of passing nodes between processes (as has been noted before, acquiring a node becomes the only mechanism to access a file). The current VFS server uses a generic mechanism for brokering a change in externally managed state (for file descriptors), and this mechanism can be utilised exactly the same way with nodes. Jakub J. noted that if the handling of references to both is so similar, we can just as well unify them in some ways, so we don’t have two distinct mechanisms for essentially the same thing. Well, after thinking about it for some time, I must say that we don’t even need the VFS server to deal with two different types of objects, but more about that later.

In the rest of this mail, I will explain the basic structure (in broad strokes): first, how it looked before Thursday, then how I intend it to look now. I attached scans of some pictures I drew in an attempt to make it more understandable (I wanted to draw them on the computer, but I found it quicker to simply draw them by hand). Solid arrows represent control flow. When an arrow enters a filled dot next to a box representing data, it means the control flow accesses or modifies that data. Dashed lines mean that the data or return value represents the object pointed to. Lines that enter the picture from the left side are IPC calls. All in all, the pictures are pretty confusing, so don’t count on them to explain anything.

The original structure: see [scan #1].
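Before going into the structure itself, here is a rough sketch of what I mean by a client-local handle table. The names and the memory management are made up for illustration; this is not the actual VFS server code.

/* A per-client table of node references.  The integer handle handed out to
 * the client is just an index into this table, so it means nothing to any
 * other client and carries no guessable global information. */

#include <stdlib.h>

typedef struct node node_t;     /* opaque here; lives elsewhere in the server */

typedef struct {
	node_t **slots;         /* slot index == handle given to the client */
	size_t capacity;
} handle_table_t;

/* Store a node reference and return a handle for it, or -1 on failure. */
static int handle_put(handle_table_t *tbl, node_t *node)
{
	for (size_t i = 0; i < tbl->capacity; i++) {
		if (tbl->slots[i] == NULL) {
			tbl->slots[i] = node;
			return (int) i;
		}
	}

	/* No free slot - grow the table. */
	size_t ncap = (tbl->capacity == 0) ? 4 : 2 * tbl->capacity;
	node_t **nslots = realloc(tbl->slots, ncap * sizeof(node_t *));
	if (nslots == NULL)
		return -1;
	for (size_t i = tbl->capacity; i < ncap; i++)
		nslots[i] = NULL;

	nslots[tbl->capacity] = node;
	tbl->slots = nslots;
	int handle = (int) tbl->capacity;
	tbl->capacity = ncap;
	return handle;
}

/* Translate a handle received over IPC back to a node.  An invalid or stale
 * handle simply yields NULL, so a bogus request cannot crash the server. */
static node_t *handle_get(handle_table_t *tbl, int handle)
{
	if (handle < 0 || (size_t) handle >= tbl->capacity)
		return NULL;
	return tbl->slots[handle];
}

The point is that a handle only makes sense relative to the table of the client that received it, which is exactly the capability-like property argued for above.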
There are three IPC calls that directly work with the client’s namespace. IPC calls that work on nodes are the management methods - link, unlink, looking up a descendant, reference counting (the node returned by a lookup need not be newly created, hence the reference management), and opening a node to get a file descriptor. IPC calls that work on file descriptors are all the methods that access a file’s contents. Although these are internally forwarded to the node in some form, there is no direct IPC call that accesses the contents through a node.

The diagram shows the node as representing a physical file. That need not be the case, but I didn’t include that variety in this picture. Also, the right-hand part of the picture was my attempt to explain how bindings (my replacement for mount points) work. I ran out of space on that one, so it’s not really helpful, sorry.

Originally, permissions on a node were part of the node’s state, and the read/write/etc. calls checked them. There was supposed to be another call that returns an instance of the node with restricted permissions. That didn’t work all that well, so I made it simpler by adding a new kind of node that wraps the node we need to restrict. [scan #2] shows several implementations of the Node interface that serve different purposes. The two at the bottom are explained further down.

Next idea: No file descriptors, just nodes. When you look at the first picture, you notice that the file descriptor only works as a proxy for a node, accompanied by a set of permissions (or rather the mode of operation, for sanity checks) and a position. A natural thought is to get rid of it, expose WRITE_AT and READ_AT calls to the client, and implement the conventional API within libc. This is *almost* good enough. There are some problems, though. Most importantly, for some kinds of nodes, you need to know whether the client “intends” to access the contents at all, and whether it intends to read or to write. For example, pipes need this to decide whether they should block or return EOF.

Next next idea: A special kind of node to represent the client’s “intent” to work with the contents. This was a slightly weird idea. The basic approach was that there would be another “wrapper” node implementation, OpenNode. When accepting a read/write call, the VFS server would first check the type of the node in question. If it was not an OpenNode, the operation would fail; if it was, the operation would be forwarded to it normally. All nodes would support an OPEN operation that returns an OpenNode instance wrapping the receiver. In all other respects, it would be a fairly standard node. [scan #3] depicts the situation, and also suggests a better solution.

Final idea: Just nodes as before, but no violation of the interface abstraction through type checking. Instead, there are both OPEN and CLOSE operations on a node, but instead of returning anything, they manage a client-local counter of access “intentions”. As long as there have been calls to OPEN not yet matched by CLOSE, the node can be used for I/O. This keeps the simplicity of only dealing with nodes in the VFS server, allows the stream metaphor to be implemented purely in libc, and still keeps track of which clients read and write the contents. (Rough sketches of both sides follow below.)

So there you have it: a single, conceptually simple kind of object exposed by the VFS server, leaving libc to implement the UNIX API if we want to (though I’m utterly convinced we can do better). Any thoughts?
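To make the final idea a bit more concrete, here is a rough sketch of the server side. All names and types are invented for illustration; this is not meant as the actual implementation. A slot of the client-local handle table sketched above would carry the node plus the counter of unmatched OPEN calls.

#include <errno.h>
#include <stddef.h>
#include <sys/types.h>

typedef struct node node_t;

/* Operations implemented by every node type (regular file, pipe, union, ...). */
typedef struct {
	ssize_t (*read_at)(node_t *, void *buf, size_t size, off_t pos);
	ssize_t (*write_at)(node_t *, const void *buf, size_t size, off_t pos);
} node_ops_t;

struct node {
	const node_ops_t *ops;
};

/* What a client-local handle slot could carry: the node plus the number of
 * OPEN calls not yet matched by CLOSE for this particular client. */
typedef struct {
	node_t *node;
	unsigned open_count;
} node_ref_t;

/* OPEN and CLOSE do not return a new object; they only maintain the counter. */
static void vfs_node_open(node_ref_t *ref)
{
	ref->open_count++;
}

static void vfs_node_close(node_ref_t *ref)
{
	if (ref->open_count > 0)
		ref->open_count--;
}

/* READ_AT is allowed only while the client has expressed its intent via OPEN;
 * otherwise it fails without ever reaching the node implementation. */
static ssize_t vfs_node_read_at(node_ref_t *ref, void *buf, size_t size, off_t pos)
{
	if (ref->open_count == 0)
		return -EBADF;   /* no matching OPEN */
	return ref->node->ops->read_at(ref->node, buf, size, pos);
}

How the OPEN/CLOSE information would be propagated to the node implementation itself (so that e.g. a pipe can see its readers and writers) is left out of the sketch.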
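And the libc side: a sketch of how the conventional read() could be built purely in libc on top of a positional READ_AT call. The vfs_read_at() stub, its signature, and the stream_t type are again made up for illustration.

#include <stddef.h>
#include <sys/types.h>

/* Hypothetical IPC stub: ask the VFS server to read `size` bytes at `pos`
 * from the node referred to by `handle`. */
extern ssize_t vfs_read_at(int handle, void *buf, size_t size, off_t pos);

/* What libc keeps per stream: the node handle and the current position -
 * i.e. the state that used to live in the VFS server's file descriptor. */
typedef struct {
	int handle;
	off_t pos;
} stream_t;

/* Classic read() semantics, implemented entirely on the client side. */
ssize_t stream_read(stream_t *s, void *buf, size_t size)
{
	ssize_t nread = vfs_read_at(s->handle, buf, size, s->pos);
	if (nread > 0)
		s->pos += nread;
	return nread;
}

write() and lseek() would be analogous; lseek() in particular would become a purely local operation.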
PS: I spent a lot of time thinking over this and other things and getting my scanner to work, so all the progress I was able to make on the implementation was to get the already written code to compile. I hope next weekend will be more productive. Now I have a very clear picture of what each part should do exactly, so it shouldn’t take long until some workable prototype exists.

-- Jirka Z.

[scan #1] https://docs.google.com/open?id=0BwpNZtnCMAWIRC1FY25HYzc0M0k
[scan #2] https://docs.google.com/open?id=0BwpNZtnCMAWIUUQwZ21QbkhLV0U
[scan #3] https://docs.google.com/open?id=0BwpNZtnCMAWIc0VDbkltZS1kem8

_______________________________________________
HelenOS-devel mailing list
[email protected]
http://lists.modry.cz/cgi-bin/listinfo/helenos-devel
