Re: [RFC] Possible design for mount traps
In message [EMAIL PROTECTED], Alexander Viro writes:

> [...]
> So what about the following trick: let's allow vfsmounts without associated superblock and allow to "mount" them even on the negative dentries? Notice that the latter will not break walk_name() - checks for dentry being negative are done after we try to follow mounts. Notice also that once we mount something atop of such vfsmount it becomes completely invisible - it's wedged between two real objects and following mounts will walk through it without stopping.
>
> So the only case when these beasts count is when they are "mounted", but nothing is mounted atop of them. But that's precisely the class of situations we are interested in. In case of autofs we want follow_down() into such animal to trigger mounting, in case of portalfs - passing the rest of pathname to daemon, in case of devfs-with-automount we want to kick devfsd.
>
> So let them have a method that would be called upon such follow_down() (i.e. one when we have nothing mounted atop of us). And that's it. These objects are not filesystems - they rather look like traps set in the unified tree. Notice that they do not waste anon device like "one node autofs" would do.
>
> That way if autofs daemon mounted /mnt/net/foo it would not follow up with /mnt/net/foo/bar - it would just set the trap in /mnt/net/foo/bar and let the actual lookups trigger further mounts.
> [...]

This sounds almost identical to what Sun did to solve similar problems in their first version of autofs. There's a paper in LISA '99 describing their enhancements to the original autofs. Your proposal, however, is better because it generalizes to more than autofs.

Erez.
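The trap behaviour in the quoted proposal can be modelled in a few lines of userspace C. This is only a sketch under stated assumptions, not kernel code: `struct toy_mount`, `toy_follow_down()`, `toy_autofs_trigger()` and `toy_nfs` are hypothetical names invented for the illustration, and the real follow_down() works on dentries and vfsmounts rather than on a simple linked chain.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-in for a vfsmount; every name here is hypothetical. */
struct toy_mount {
    const char *name;
    int has_superblock;                     /* 0 marks a trap, 1 a real filesystem */
    struct toy_mount *mounted_atop;         /* object mounted on top of us, or NULL */
    int (*trigger)(struct toy_mount *self); /* trap method; returns 0 on success */
    int trigger_calls;                      /* how often the trap has fired */
};

/*
 * Walk up the mount chain starting at m. A trap that already has
 * something mounted atop of it is passed through without stopping.
 * A trap with nothing atop of it fires its method, which may mount
 * something; if it did, the walk continues into the new mount.
 */
static struct toy_mount *toy_follow_down(struct toy_mount *m)
{
    assert(m != NULL);
    for (;;) {
        if (m->mounted_atop) {
            m = m->mounted_atop;
            continue;
        }
        if (!m->has_superblock && m->trigger) {
            m->trigger_calls++;
            if (m->trigger(m) == 0 && m->mounted_atop)
                continue;
        }
        return m;
    }
}

/* Pretend filesystem that an autofs-like trap mounts on demand. */
static struct toy_mount toy_nfs = { "nfs", 1, NULL, NULL, 0 };

/* Example trap method: mounts toy_nfs atop of the trap. */
static int toy_autofs_trigger(struct toy_mount *self)
{
    self->mounted_atop = &toy_nfs;
    return 0;
}
```

Calling `toy_follow_down()` on a fresh trap whose method is `toy_autofs_trigger` fires the method once and ends on `toy_nfs`; a second walk passes straight through the trap without firing it again, which is the "completely invisible" property described in the proposal.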
RE: [RFC] Possible design for mount traps
On Wed, 3 May 2000, Jeremy Fitzhardinge wrote:

> I'll happily get rid of the tree scanning if there's a better way of doing the same thing. I don't want to change the basic mechanism of autofs4 right at the moment though.

OK, then. In practical terms it means that right now autofs4 retains the scanning but gets switched to vfsmount linkage, traps go after the multiple-mount stuff and once they are in (if Linus approves such a beast, that is) tree-scanning may go.

> > Hrrrmmm... Probably the former, but I can argue it both ways ;-)
>
> Well, you could get the latter behaviour from the former simply by holding an extra reference and preventing the umount, but you can't simulate the former from the latter.

Yep, but it means two damn mechanisms for removing them - you definitely should be able to remove them explicitly and do it without umounting the host.

[snip files-as-directories - let's get back to that stuff when somebody stands up and says "I'm going to start doing it right now", OK? Again, the mechanism doesn't care, so it won't take large changes in that area.]

> > It depends. How much are you going to do with the filesystem before umount(8)?
>
> Probably not a lot, but I was thinking of something a bit more general than autofs. Actually, it would be nice to see transitions just to get a sense of how much use a filesystem is getting.

I'd rather see it done completely from userland. Theory: currently umount(2) is expensive. Really expensive. The main reason being that we are trying to shrink dcache way too early - before we know that the tree is not busy. It can be helped. Notice that opened files will prevent the call of shrink_dcache_sb() right now - you'll be stopped by ->mnt_count on vfsmnt. Lookup-in-process will have the same effect. The only source of situations when we can get to shrink_dcache_sb() and fail umount(2) looks so: we do lookup, put vfsmount, do something and only after that put the dentry. Which can be trivially fixed if we postpone mntput() until after the dput(). Then may_umount() becomes utterly trivial - it should just check ->mnt_count.

With light-weight umount(2) we are in a completely different situation - then the expiry code may be safely moved into userland.

> > *shrug* two times slower. At least something...
>
> I suppose. It would be nice to find some replacement for the fake block devices for blockless filesystems.

Three words: stat(2). st_dev. POSIX.
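The ordering argument (drop the dentry first, the mount last, so that the busy check only needs the mount's reference count) can be illustrated with a toy refcount model. This is a sketch under assumptions, not the kernel implementation: all `toy_` names are hypothetical, and the convention that a count of 1 means "only the mount itself holds a reference" is chosen purely for the example.

```c
#include <assert.h>

/* Toy stand-ins; a mount's count starts at 1 (held by the mount itself). */
struct toy_vfsmnt { int mnt_count; };
struct toy_dentry { int d_count; };

static void toy_mntget(struct toy_vfsmnt *m) { m->mnt_count++; }
static void toy_dget(struct toy_dentry *d)   { d->d_count++; }

static void toy_mntput(struct toy_vfsmnt *m)
{
    assert(m->mnt_count > 1);   /* never drop the mount's own reference */
    m->mnt_count--;
}

static void toy_dput(struct toy_dentry *d)
{
    assert(d->d_count > 0);
    d->d_count--;
}

/* With the ordering below, this single check is enough. */
static int toy_may_umount(const struct toy_vfsmnt *m)
{
    return m->mnt_count == 1;
}

/* A lookup result pins both the mount and the dentry. */
static void toy_lookup_hold(struct toy_vfsmnt *m, struct toy_dentry *d)
{
    toy_mntget(m);
    toy_dget(d);
}

/* Release in the proposed order: dentry first, mount last. */
static void toy_lookup_release(struct toy_vfsmnt *m, struct toy_dentry *d)
{
    toy_dput(d);
    toy_mntput(m);
}
```

Because the mount reference outlives the dentry reference, there is no moment at which a dentry is still pinned while `toy_may_umount()` reports the mount as idle. With the opposite order that window exists, which is the case where the dcache gets shrunk and umount(2) then fails anyway.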
Re: [RFC] Possible design for mount traps
On Wed, 3 May 2000, Richard Gooch wrote:

> I think you're referring here to a "split" devfs, where each driver exports a mini-devfs. In such an environment, your mount traps would probably be good. However, I don't think the mini-devfs idea is a good approach. There are good reasons for having a unified tree. For one thing, there is the issue of mounting /. For another, some drivers (e.g. cdrom) need linkages (not just symlinks) into other parts of the devfs namespace.

Details, please? Notice that we are going to get the equivalent of Plan 9 bind() RSN - it works in my tree right now and all I need to merge it into the main tree is to sort out the autofs4 stuff. Well, since Jeremy wants to postpone the autofs4 changes (i.e. not go for the traps-based scheme right now) - fine, I'm merging the autofs4 patches that switch the thing to the new linkage, toss in the ten-liner for knfsd (the linkage-related stuff there is minimal), test it and submit to Linus. mount -t bind goes immediately after that. So getting the linkage between parts of the unified tree and doing that without any symlinks is trivial. And what's up with mounting /?

> Also, it would be hard (or impossible) for related drivers to share the same directory (e.g. the SCSI subsystem). At the least, there would have to be more co-operation between drivers. Compare this to the current devfs implementation where things are fairly modular and independent.

Why? union-mount their trees on /dev/scsi and you are done. No?

> Anyway, while these mount traps are a good thing, particularly for autofs, I don't think they're going to help simplify devfs (without castrating devfs and probably breaking the Linus-mandated namespace;-).

I wouldn't worry too much about the namespace breakage - check the bind(2) manpage in Plan 9 to see what is coming (their manpages are available and searchable on http://www.freebsd.org/docs.html#man, along with the manpages from a lot of other systems - kudos to Wolfram Schneider and the freebsd.org folks; a very convenient resource they have put there). We'll get tools that can repair such breakage.
RE: [RFC] Possible design for mount traps
On 03-May-2000 Alexander Viro wrote:

> as tree scanning in autofs4 switches to new linkage/goes away[1] we are
>
> [1] I would really prefer the latter, but if it will be hard to do fast - fine, it will be switch to new linkage; I have that code.

I'll happily get rid of the tree scanning if there's a better way of doing the same thing. I don't want to change the basic mechanism of autofs4 right at the moment though.

> > BTW, what happens if you umount a filesystem which has these scattered about its namespace? Do they get cleaned up as part of the umount (appropriate callback, etc), or do you need to clear them out before the umount? I prefer the former.
>
> Hrrrmmm... Probably the former, but I can argue it both ways ;-)

Well, you could get the latter behaviour from the former simply by holding an extra reference and preventing the umount, but you can't simulate the former from the latter.

> > Also, what happens if you attach one to a non-directory? Could you use it to put arbitrary "special files" into the namespace without having to do anything special? It would make things like Pavel's podfuk more useful without having to do horrible namespace hacks as he does now.
>
> Ummm... I'm not sure that I like the idea. Reason: I'm very suspicious of the situations when a file turns into a directory and back. I've never seen it done right and in all cases when it had been done it was full of nasty special cases, kludges, etc. Mostly on the userland side of things, BTW. If you can do it in a clean way and nothing will break I'll be only glad about that. The mechanism itself doesn't care about the type of that stuff, so I have no objections on that side. Just a nasty gut feeling...

Well, I don't see a good reason not to make a file respond to readdir. chdir and chroot currently prevent any non-directory from being current, so you need to have a more general notion of directory-ness to make them work on magic files.

The other approach is to have a file act like a symlink under some circumstances, but I haven't thought that through properly. That's essentially what podfuk does at the moment, with its magic mapping to the /overlay tree.

Then there are the cases where all you want is some ordinary-looking files with dynamic content. That doesn't involve overlaying any incompatible semantics; it just means you have a file in the namespace which isn't on the filesystem (I guess you could get the same effect with a filesystem which has a file as the top-level dentry, but I seem to remember that didn't work very well last time I tried it).

> > Also, when one is inserted between two real filesystems, it still needs to be able to mediate namespace lookups. Autofs may need this to block access to a filesystem while the daemon is umounting it.
>
> It depends. How much are you going to do with the filesystem before umount(8)?

Probably not a lot, but I was thinking of something a bit more general than autofs. Actually, it would be nice to see transitions just to get a sense of how much use a filesystem is getting.

> *shrug* two times slower. At least something...

I suppose. It would be nice to find some replacement for the fake block devices for blockless filesystems.

J
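The idea of a trap that stays wedged between two real filesystems, counts the transitions through it and can hold off lookups while a daemon is unmounting, could look roughly like the sketch below. Everything in it is an assumption made for illustration: `struct toy_gate` and its functions are hypothetical, and returning -EAGAIN stands in for whatever a real implementation would do (most likely sleeping until the umount finishes).

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical pass-through trap state. */
struct toy_gate {
    unsigned long crossings;  /* lookups that walked through the trap */
    int expiring;             /* nonzero while the daemon is unmounting */
};

/* Called on every walk through the trap. 0 lets the lookup continue. */
static int toy_gate_cross(struct toy_gate *g)
{
    if (g->expiring)
        return -EAGAIN;       /* stand-in for blocking the caller */
    g->crossings++;
    return 0;
}

/*
 * Daemon side: claim the mount for expiry only if nobody has crossed
 * since the daemon last sampled the counter (value passed as seen).
 * Returns 1 if the claim succeeded, 0 if the filesystem was used.
 */
static int toy_gate_try_expire(struct toy_gate *g, unsigned long seen)
{
    if (g->crossings != seen)
        return 0;
    g->expiring = 1;
    return 1;
}

/* Daemon side: reopen the gate, e.g. after a failed umount. */
static void toy_gate_reopen(struct toy_gate *g)
{
    g->expiring = 0;
}
```

The counter gives the "how much use is this filesystem getting" signal mentioned above without scanning the tree; the model is single-threaded and ignores the locking a real version would need.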
Re: [RFC] Possible design for mount traps
Alexander Viro writes:

> Folks, I've tried to describe the stuff that may IMO become useful for autofs/devfs/portalfs/etc. Comments are more than welcome.
>
> Current problems:
>
> 5. Any schemes with automount-like stuff in devfs require (union-)mount being triggered if lookup brings negative in all components already mounted. IOW, if the search gets to the last component of union-mount.

I think you're referring here to a "split" devfs, where each driver exports a mini-devfs. In such an environment, your mount traps would probably be good. However, I don't think the mini-devfs idea is a good approach. There are good reasons for having a unified tree. For one thing, there is the issue of mounting /. For another, some drivers (e.g. cdrom) need linkages (not just symlinks) into other parts of the devfs namespace.

Also, it would be hard (or impossible) for related drivers to share the same directory (e.g. the SCSI subsystem). At the least, there would have to be more co-operation between drivers. Compare this to the current devfs implementation where things are fairly modular and independent.

> So what about the following trick: let's allow vfsmounts without associated superblock and allow to "mount" them even on the negative dentries? Notice that the latter will not break walk_name() - checks for dentry being negative are done after we try to follow mounts. Notice also that once we mount something atop of such vfsmount it becomes completely invisible - it's wedged between two real objects and following mounts will walk through it without stopping.
>
> So the only case when these beasts count is when they are "mounted", but nothing is mounted atop of them. But that's precisely the class of situations we are interested in. In case of autofs we want follow_down() into such animal to trigger mounting, in case of portalfs - passing the rest of pathname to daemon, in case of devfs-with-automount we want to kick devfsd.
>
> So let them have a method that would be called upon such follow_down() (i.e. one when we have nothing mounted atop of us). And that's it. These objects are not filesystems - they rather look like traps set in the unified tree. Notice that they do not waste anon device like "one node autofs" would do.

This sounds a lot like the fake inodes I proposed a couple of years ago to solve the autofs direct mount problem.

Anyway, while these mount traps are a good thing, particularly for autofs, I don't think they're going to help simplify devfs (without castrating devfs and probably breaking the Linus-mandated namespace;-).

Regards,

Richard
Permanent: [EMAIL PROTECTED]
Current: [EMAIL PROTECTED]
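Problem 5 quoted in this message (fire the trap only when the lookup comes up negative in every component of a union mount) can be sketched as a lookup over a list of layers with a trap at the end. This is a toy model under assumptions, not devfs code: the `toy_` types and functions are hypothetical, and `toy_kick_devfsd()` merely pretends that a daemon can supply one extra name.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

enum toy_result { TOY_NEGATIVE = 0, TOY_FOUND = 1, TOY_TRAPPED = 2 };

/* One component of the union: a NULL-terminated list of entry names. */
struct toy_layer {
    const char *const *names;
};

struct toy_union {
    const struct toy_layer *layers;
    size_t nlayers;
    int (*trap)(const char *name);  /* nonzero if the trap supplied the name */
    int trap_calls;                 /* how often the trap has fired */
};

static int toy_layer_has(const struct toy_layer *l, const char *name)
{
    size_t i;

    for (i = 0; l->names[i] != NULL; i++)
        if (strcmp(l->names[i], name) == 0)
            return 1;
    return 0;
}

/* Search every layer in order; only a miss in all of them reaches the trap. */
static enum toy_result toy_union_lookup(struct toy_union *u, const char *name)
{
    size_t i;

    assert(u != NULL && name != NULL);
    for (i = 0; i < u->nlayers; i++)
        if (toy_layer_has(&u->layers[i], name))
            return TOY_FOUND;
    if (u->trap) {
        u->trap_calls++;
        if (u->trap(name))
            return TOY_TRAPPED;
    }
    return TOY_NEGATIVE;
}

/* Hypothetical stand-in for kicking devfsd: it can supply only "sdc". */
static int toy_kick_devfsd(const char *name)
{
    return strcmp(name, "sdc") == 0;
}
```

Names present in any layer never reach the trap, so the daemon is only bothered for lookups that would otherwise fail.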