Re: [HelenOS-devel] VFS nodes and file descriptors

Jiří Zárevúcky Wed, 17 Oct 2012 08:27:54 -0700

On 17 October 2012 15:21, Jakub Jermar <[email protected]> wrote:
> Hi Jiri,
>
> thanks for elaborating on your ideas, see my comments below.
>

Thanks for taking the time to read them.

> On 15.10.2012 22:07, Jiří Zárevúcky wrote:
>> Firstly, let me explain, “why nodes”.
>> The current VFS server internally keeps state as a “VFS triplet”. It
>> is basically an exact address of the file in currently running VFS
>> (server + file system driver). Outside the server, clients refer to
>> files using their filenames. Simple enough.
>
> In general, the clients are using file names only to get a file handle.
> Once the client has the file handle, the file name can both disappear
> (unlink) or change (rename). The file name is not important at all when
> you have a handle.
>

Yes. The problem is that the "file handles" we have now are not just
references to files. They are complete active IO streams built on top
of files. That is a lot of unnecessary assumptions about how we use
files. In this sense, I want to make the handles as simple as
possible, with no "clutter". For example, you should be able to work
with a file that is impossible to open for IO for some reason and would
hang if you tried. The classical "file handle" doesn't work like that,
or at least it is confusing as hell when it does.

>> That means the most reasonable
>> approach is to deal with nodes the same way we deal with open files -
>> passing an index into a client-local table, that is managed
>> dynamically.
> ...
>> With file-descriptor-like handling comes an analogous problem of
>> passing nodes between processes (as has been noted before, acquiring
>> node becomes the only mechanism to access a file). The current VFS
>> server uses a generic mechanism for brokering a change in externally
>> managed state (for file descriptors), and this mechanism can be
>> utilised exactly the same way with nodes.
>
> This is basically how capabilities work. Client-local file handles are
> merely a specialization of this principle. In my view, we should either
> reuse (and, if needed, redefine the meaning of) file handles, or
> introduce a generic mechanism for capabilities so that we don't end up
> with a dozen specializations of the capability mechanism. Another already
> existing specialization is IPC phones.
>

I think that what I'm doing right now is redefining file handles and
calling them "nodes" to avoid confusion.

>> Jakub J. noted that if the handling of references to both is so
>> similar, we can just as well unify them in some ways, so we don’t have
>> two distinct mechanisms for essentially the same thing.
>> Well, after thinking about it for some time, I must say that we don’t
>> even need VFS to deal with two different types of objects, but more
>> about that later.
>
> Ok, let's see.
>
>> Next idea: No file descriptors, just nodes.
>
> What would the local node identifiers be? Integers? If so, that's
> effectively what file handles are. We don't really need to support POSIX
> read/write() on the VFS level (even now, as discussed during the last
> project meeting) and provide only pread()/pwrite() (coincidentally also
> POSIX), because that's what the VFS_OUT_READ/WRITE actually does. The
> only problem remains access mode which you address in the final idea below.
>

Yes. They are file handles, just under a different name to make clear
what I mean.

>> Final idea:
>> Just nodes as before, but no violation of the interface abstraction
>> through type checking. Instead, there is both OPEN and CLOSE operation
>> on a node, but instead of returning anything, it manages a
>> client-local counter of access “intentions”. As long as there have
>> been any calls to OPEN not yet matched by CLOSE, the node can be used
>> for I/O. This keeps the simplicity of only dealing with nodes in the VFS
>> server, allowing the stream metaphor to be implemented purely in libc,
>> while keeping track of clients that read and write contents.
>
> So you want to keep the access information in the client-local table of
> ID's that identify nodes. Or do you rather want to allocate a separate
> per-client state for each node?
>

These two should be equivalent.
It doesn't even matter if the same node has different IDs with
different access information, or if each node always has a unique ID,
or if there are multiple IDs and a single per-node access state. It's
an implementation detail. The only important thing is that a corrupted
state caused by a mismanagement on the client's part will always be
fixed when the client disconnects. I keep the access information in
the client-local table of accessed files, because it would be too
complicated to do any other way.
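
To make the bookkeeping concrete, here is a minimal sketch in C of a
client-local table that holds (node, counter) pairs and is wiped when the
client disconnects. Every name in it (node_t, slot_t, client_t, the
client_* functions, TABLE_SIZE) is hypothetical and chosen for
illustration only; none of it is existing HelenOS code, and the per-node
"users" total is an assumption about what the server might track.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TABLE_SIZE 8

/* Stand-in for the server's per-file object (hypothetical). */
typedef struct {
	unsigned users;    /* unmatched OPENs summed over all clients */
} node_t;

/* One slot of the client-local table: (node, intention counter). */
typedef struct {
	node_t *node;      /* NULL marks a free slot */
	unsigned open_cnt; /* OPENs not yet matched by CLOSE */
} slot_t;

typedef struct {
	slot_t slots[TABLE_SIZE];
} client_t;

/* Hand out a client-local ID for a node, or -1 if the table is full. */
static int client_acquire(client_t *c, node_t *node)
{
	assert(c != NULL && node != NULL);
	for (int id = 0; id < TABLE_SIZE; id++) {
		if (c->slots[id].node == NULL) {
			c->slots[id].node = node;
			c->slots[id].open_cnt = 0;
			return id;
		}
	}
	return -1;
}

/* Look up a slot by ID; NULL if the ID is out of range or unused. */
static slot_t *client_slot(client_t *c, int id)
{
	if (id < 0 || id >= TABLE_SIZE || c->slots[id].node == NULL)
		return NULL;
	return &c->slots[id];
}

/* OPEN returns no new handle, it only bumps the counters. */
static bool client_open(client_t *c, int id)
{
	slot_t *s = client_slot(c, id);
	if (s == NULL)
		return false;
	s->open_cnt++;
	s->node->users++;
	return true;
}

/* CLOSE fails if there is no unmatched OPEN from this client. */
static bool client_close(client_t *c, int id)
{
	slot_t *s = client_slot(c, id);
	if (s == NULL || s->open_cnt == 0)
		return false;
	s->open_cnt--;
	s->node->users--;
	return true;
}

/* I/O is allowed only while at least one OPEN is unmatched. */
static bool client_can_io(client_t *c, int id)
{
	slot_t *s = client_slot(c, id);
	return s != NULL && s->open_cnt > 0;
}

/* On disconnect, undo whatever the client forgot to CLOSE. */
static void client_disconnect(client_t *c)
{
	for (int id = 0; id < TABLE_SIZE; id++) {
		slot_t *s = &c->slots[id];
		if (s->node == NULL)
			continue;
		s->node->users -= s->open_cnt;
		s->node = NULL;
		s->open_cnt = 0;
	}
}
```

Because the counter lives in the client's own table, a client that calls
OPEN twice and CLOSE only once leaves no lasting damage: the disconnect
pass subtracts the leftover count from the node and frees the slot.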

>> So there you have it, a single conceptually simple kind of object
>> exposed by the VFS server, leaving libc to implement the UNIX API if
>> we want to (though I’m utterly convinced we can do better).
>
> If I understood your proposal correctly, you are basically suggesting:
>
> - stop using the term file handle, but continue to use the underlying
> principle to refer to file system nodes (both open and other types)
>

I called it "node" to make it clear that it doesn't behave like the
current "file handle", which is conceptually an active stream.
Secondly, an open node is not a type (which was something I considered
for a while but abandoned, since it's too weird); rather, each
client manages its current access mode for the files it accesses. It
is more of a special kind of reference counting, really.

> - remove the position pointer from the file/node table and leave it in
> libc; VFS will cease to support read/write/lseek() directly on the
> client side of its interface and will support only an equivalent of
> pread() and pwrite()

Yes, that is correct.
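
As a rough illustration of that split, the sketch below keeps the
position pointer in a client-side stream and only ever issues positional
reads. The names node_t, node_read_at, stream_t, stream_read and
stream_seek are made up for this example, and the in-memory buffer merely
stands in for whatever the server would return; it is not the actual
VFS or libc interface.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for file contents held on the server side. */
typedef struct {
	const char *data;
	size_t size;
} node_t;

/* Stateless positional read, in the spirit of pread(). */
static size_t node_read_at(const node_t *node, size_t pos, void *buf,
    size_t len)
{
	assert(node != NULL && buf != NULL);
	if (pos >= node->size)
		return 0;
	size_t avail = node->size - pos;
	if (len > avail)
		len = avail;
	memcpy(buf, node->data + pos, len);
	return len;
}

/* Client-side stream: the position lives here, not in the server. */
typedef struct {
	const node_t *node;
	size_t pos;
} stream_t;

/* Sequential read built purely on top of the positional read. */
static size_t stream_read(stream_t *s, void *buf, size_t len)
{
	size_t n = node_read_at(s->node, s->pos, buf, len);
	s->pos += n;
	return n;
}

/* Seeking never talks to the server; it only updates local state. */
static void stream_seek(stream_t *s, size_t pos)
{
	s->pos = pos;
}
```

Two streams over the same node would then have independent positions
without the server knowing about either of them.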

> - the intention counter will distinguish an open node from other kinds
> of nodes; and will specify the exact access mode for a given client ID
> for the node
>

As I said, it's rather a "being used for" attribute that is
client-specific. But essentially, yes. Another way to think about it
is that the VFS acts as a multiplexer for the API the filesystem driver
provides (where open/close are essentially "prepare for IO" and
"finalize IO" operations).

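One possible reading of the multiplexer view, sketched in C: the server
sums the OPENs of all clients and forwards "prepare for IO" to the driver
only on the first one, and "finalize IO" only when the last one is
closed. That forwarding policy is an assumption made for the example, and
driver_file_t, driver_prepare, driver_finalize, mux_node_t, mux_open and
mux_close are hypothetical names, not the real driver interface.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical driver-side file; the counters only record calls. */
typedef struct {
	int prepare_calls;
	int finalize_calls;
} driver_file_t;

static void driver_prepare(driver_file_t *f)
{
	f->prepare_calls++;
}

static void driver_finalize(driver_file_t *f)
{
	f->finalize_calls++;
}

/* Server-side node: total of unmatched OPENs from all clients. */
typedef struct {
	driver_file_t *file;
	unsigned open_total;
} mux_node_t;

/* The first OPEN from any client prepares the driver for IO. */
static void mux_open(mux_node_t *n)
{
	assert(n != NULL && n->file != NULL);
	if (n->open_total == 0)
		driver_prepare(n->file);
	n->open_total++;
}

/* The last CLOSE finalizes IO; an unmatched CLOSE is rejected. */
static int mux_close(mux_node_t *n)
{
	assert(n != NULL && n->file != NULL);
	if (n->open_total == 0)
		return -1;
	n->open_total--;
	if (n->open_total == 0)
		driver_finalize(n->file);
	return 0;
}
```

With two clients opening the same node, the driver would see a single
prepare and a single finalize, no matter how the OPEN and CLOSE calls
interleave.
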
> - the file/node table will in fact be a table of tuples (intention
> counters, vfs_node_t *) or it will be a table of vfs_node_t *, but each
> vfs_node_t will have to track client intention counters individually;
> you may or may not want to get rid of one level of indirection which now
> exists in the file descriptor table
>

Yes. Table of tuples as you described.

> If this is correct, I think it represents certain unification or even
> simplification of VFS (and at the same time more complexity for libc),
> but the basic principles remain in place. It's more like that you don't
> like to call the node IDs file handles, but they are essentially the
> same or at least very similar thing.

Similar in implementation, but not the same conceptually. I think it's
very important to be aware of the abstract ideas concrete
implementations try to represent. Or maybe I'm just thinking too much
and giving undue attention to unimportant things. That is possible.

Actually, I myself don't like the name I use, but explaining which
version of "file handle" I mean at each occurrence of the term would
be too confusing.

> I may have misunderstood where you
> want to store the intention counters, but it does not really change that
> much, IMHO.
>

As I've said, they are client-specific and the rest is an
implementation detail. That much is relevant for the outside view.

I hope I shed some more light on what I actually mean by what I write.
I am well aware that I can be quite confusing at times. :)

-- Jirka Z.
