Hi,

On Wed, Oct 29, 2008 at 04:00:49PM +0100, Arne Babenhauserheide wrote:
> On Wednesday 29 October 2008 12:16:58, [EMAIL PROTECTED] wrote:
> I'll digest it in little pieces and answer directly...

I actually already considered splitting it up myself :-)

> > > #### Give back power to users:
> >
> > While this was indeed the main idea behind the Hurd design, it is
> > rather vague for the most part. We have the architecture which
> > potentially gives users more power, but we have very few actual use
> > cases for that...
>
> I attached one:
>
> - You have 1,000,000 files to store, which makes a simple "ls" damn
>   slow.
> - So you develop a simple container format with reduced metadata
>   and specialized access characteristics.
> - Now you want to make that container accessible via the normal
>   filesystem.
>
> Please check the two attached presentations to see the pain this
> causes using Linux.

I must admit that I fail to see the "pain" in these presentations...
The only problem with FUSE in this context seems to be performance; I
wonder whether Hurd translators would do better on that score.

The container stuff itself is quite interesting, BTW. It's closely
related to some things I have been pondering -- yet another thing I'm
meaning to blog about One Of These Days (TM)...

I realized at some point that for some hurdish applications, we need a
way to store fine-grained structured data. What is the best approach
for that?

One way is to put it into a file, using some structured file format
(XML, s-exprs, or the like). The problem is that changing
(adding/removing/replacing) individual pieces of data in the middle of
the file is both awkward and inefficient: it requires rewriting the
file from the affected region up to the end. Also, accessing
individual data items is quite complicated, as it always requires a
parser for the respective format.

Storing the data as a large directory tree, on the other hand, allows
very easy and efficient access and updates of individual items.
However, it takes a lot of disk space.
(This is due to redundant file metadata like permissions etc., and
also to the internal structure of the filesystem imposing a lot of
overhead with many tiny files.) And working with a whole set of data
items at once (e.g. copying or replacing an entire subtree) becomes
quite awkward.

First I was thinking of some kind of DB translator, which stores the
data in a normal file, but instead of storing the contents linearly,
uses some internal allocation mechanism -- just like a full-blown
DBMS. I soon realized, though, that this would be too rigid in many
cases: often it is useful to access the *same* data in different
ways, depending on context.

The storage mechanism and the access method are in fact quite
orthogonal -- what we really want is the ability to access *any* data
either through a directory tree or through a structured file
interface, as needed. Whether the data is actually stored in
individual files or in a container should be totally transparent.

So on the frontend we want a dual interface that allows accessing the
data either as directory trees or as structured files. On the backend,
a normal filesystem, with the aid of containers where appropriate,
could serve as a temporary solution -- but in the long run, we
probably want a special filesystem, allowing both efficient storage of
complex structures and efficient access/update of individual items at
the same time.

I wonder whether this could be implemented as an extension of some
existing filesystem, or whether a completely new approach is
required...

-antrik-
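P.S. For the curious, here is a toy sketch (in Python, all names
hypothetical) of the update-cost asymmetry I mean: replacing one item
stored in the middle of a flat structured file forces a rewrite of the
whole file, while the directory-tree layout touches only one small
file -- at the cost of per-item filesystem metadata overhead.

```python
import os
import tempfile

def flat_write(path, items):
    """Store (key, value) items as 'key=value' lines in one flat file."""
    with open(path, "w") as f:
        for k, v in items:
            f.write(f"{k}={v}\n")

def flat_replace(path, key, new_value):
    """Replace one item in the flat file.

    Since the items are serialized linearly, we cannot patch one item
    in place (values may change length): everything from the affected
    region onward must be re-serialized -- here, simply the whole file.
    """
    with open(path) as f:
        lines = f.readlines()
    for i, line in enumerate(lines):
        if line.startswith(key + "="):
            lines[i] = f"{key}={new_value}\n"
            break
    with open(path, "w") as f:   # whole-file rewrite
        f.writelines(lines)

def tree_write(root, items):
    """Store each item as its own file under a directory."""
    os.makedirs(root, exist_ok=True)
    for k, v in items:
        with open(os.path.join(root, k), "w") as f:
            f.write(v)

def tree_replace(root, key, new_value):
    """Replace one item in the tree: only one small file is rewritten,
    but every item pays for its own inode, permissions, etc."""
    with open(os.path.join(root, key), "w") as f:
        f.write(new_value)

if __name__ == "__main__":
    items = [("a", "1"), ("b", "2"), ("c", "3")]
    d = tempfile.mkdtemp()

    flat = os.path.join(d, "flat.txt")
    flat_write(flat, items)
    flat_replace(flat, "b", "22")

    tree = os.path.join(d, "tree")
    tree_write(tree, items)
    tree_replace(tree, "b", "22")
```

A real DB translator would of course use an internal allocation
mechanism instead of rewriting, which is exactly the point.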