RE: [RFC] Possible design for "mount traps"

2000-05-03 Thread Jeremy Fitzhardinge


On 03-May-2000 Alexander Viro wrote:

> Thus the need to
> scan the whole tree from the autofs code, play the games with remounting
> stuff if expiry fails in the middle (somebody went into /mnt/net/foo while
> we were umounting /mnt/net/foo/bar and that made umount /mnt/net/foo
> fail; have to remount everything).

This doesn't happen because the kernel code makes sure the umount is
always possible once it has told the daemon about it.  There's some recovery
code in the daemon, but it never gets run.

>   So what about the following trick: let's allow vfsmounts without
> associated superblock and allow to "mount" them even on the negative
> dentries? Notice that the latter will not break walk_name() - checks for
> dentry being negative are done after we try to follow mounts.
>   Notice also that once we mount something atop of such vfsmount it
> becomes completely invisible - it's wedged between two real objects and
> following mounts will walk through it without stopping.

This would be broadly useful for autofs, since it's pretty much what's required
to implement direct mounts.  This would allow us to do incremental mount and
expiry of individual filesystems in the tree without having to do them en masse
like autofs4 currently does.  It also means we don't need a special autofs4
filesystem mounted, because we can garnish the namespace without it.  It
doesn't hurt that direct mounts are about the #1 requested feature.

I know hpa has been thinking about how to stack dentries.  How does this
compare?

BTW, what happens if you umount a filesystem which has these scattered about
its namespace?  Do they get cleaned up as part of the umount (appropriate
callback, etc), or do you need to clear them out before the umount?  I prefer
the former.

Also, what happens if you attach one to a non-directory?  Could you use it to
put arbitrary "special files" into the namespace without having to do anything
special?  It would make things like Pavel's podfuk more useful without having
to do horrible namespace hacks as he does now.

Also, when one is inserted between two real filesystems, it still needs to be
able to mediate namespace lookups.  Autofs may need this to block access to a
filesystem while the daemon is umounting it.

>   These objects are not filesystems - they rather look like a traps
> set in the unified tree. Notice that they do not waste anon device like
> "one node autofs" would do.

That's not a huge issue, since you run out pretty quickly with NFS's
consumption.

>   Jeremy, would you be OK with keeping the information about
> difference between regular negatives and mountpoints-to-be that way?

I like it.  I've been thinking about something pretty similar, so I pretty much
know how I'd use it.

J



RE: [RFC] Possible design for "mount traps"

2000-05-03 Thread Alexander Viro



On Wed, 3 May 2000, Jeremy Fitzhardinge wrote:

> I know hpa has been thinking about how to stack dentries.  How does this
> compare?

Orthogonal. IIRC, hpa wanted them as a way to do loopbacks. Well, as soon
as tree scanning in autofs4 switches to new linkage/goes away[1] we are
getting a much cheaper way to do loopbacks without mucking with dentries.
So the only stacking of any kind is in the mountpoint and you hardly can
get out without that... There may be other applications of dentry
stacking, but that's completely different story - these things are
independent and stacking would be a serious overhead for autofs* needs.

[1] I would really prefer the latter, but if it will be hard to do fast -
fine, it will be switch to new linkage; I have that code.

> BTW, what happens if you umount a filesystem which has these scattered about
> its namespace?  Do they get cleaned up as part of the umount (appropriate
> callback, etc), or do you need to clear them out before the umount?  I prefer
> the former.

Hrrrmmm... Probably the former, but I can argue it both ways ;-)

> Also, what happens if you attach one to a non-directory?  Could you use it to
> put arbitrary "special files" into the namespace without having to do anything
> special?  It would make things like Pavel's podfuk more useful without having
> to do horrible namespace hacks as he does now.

Ummm... I'm not sure that I like the idea. Reason: I'm very suspicious of 
the situations when file turns into directory and back. I've never seen it
done right and in all cases when it had been done it was full of nasty
special cases, kludges, etc. Mostly on the userland side of things, BTW.
If you can do it in clean way and nothing will break I'll be only glad
about that. Mechanism itself doesn't care for the type of that stuff, so I
have no objections on that side. Just a nasty gut feeling...

> Also, when one is inserted between two real filesystems, it still needs to be
> able to mediate namespace lookups.  Autofs may need this to block access to a
> filesystem while the daemon is umounting it.

It depends. How much are you going to do with the filesystem before
umount(8)?

> >   These objects are not filesystems - they rather look like a traps
> > set in the unified tree. Notice that they do not waste anon device like
> > "one node autofs" would do.
> 
> That's not a huge issue, since you run out pretty quickly with NFS's
> consumption.

 two times slower. At least something... 




RE: [RFC] Possible design for "mount traps"

2000-05-03 Thread Jeremy Fitzhardinge


On 03-May-2000 Alexander Viro wrote:
> as tree scanning in autofs4 switches to new linkage/goes away[1] we are
>
> [1] I would really prefer the latter, but if it will be hard to do fast -
> fine, it will be switch to new linkage; I have that code.

I'll happily get rid of the tree scanning if there's a better way of doing the
same thing.  I don't want to change the basic mechanism of autofs4 right at the
moment though.

>> BTW, what happens if you umount a filesystem which has these scattered about
>> its namespace?  Do they get cleaned up as part of the umount (appropriate
>> callback, etc), or do you need to clear them out before the umount?  I prefer
>> the former.
> 
> Hrrrmmm... Probably the former, but I can argue it both ways ;-)

Well, you could get the latter behaviour from the former simply by holding an
extra reference and preventing the umount, but you can't simulate the former
from the latter.
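
The asymmetry can be pictured with a toy reference count.  This is only a
userspace sketch under stated assumptions: struct pin_mount and every pin_*
function are hypothetical names invented for illustration, not kernel
structures or the proposed interface.  Umount always cleans up attached traps
(the former behaviour); a trap owner who wants the latter behaviour simply
takes an extra reference so that the umount is refused.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical toy model of a mounted filesystem; not struct vfsmount. */
struct pin_mount {
    int refs;     /* 1 == only the mount itself holds a reference */
    int ntraps;   /* traps currently attached inside this mount */
    int mounted;  /* 1 while mounted, 0 after a successful umount */
};

static void pin_mount_init(struct pin_mount *m)
{
    m->refs = 1;
    m->ntraps = 0;
    m->mounted = 1;
}

/* "Former" policy: the trap is attached with no extra reference. */
static void pin_attach_trap(struct pin_mount *m)
{
    m->ntraps++;
}

/* "Latter" policy built on top: the owner also pins the host mount. */
static void pin_attach_pinned_trap(struct pin_mount *m)
{
    m->ntraps++;
    m->refs++;
}

static void pin_detach_pinned_trap(struct pin_mount *m)
{
    assert(m->ntraps > 0 && m->refs > 1);
    m->ntraps--;
    m->refs--;
}

/* Umount refuses while anyone else holds a reference; otherwise it
 * cleans up any remaining traps as part of the umount. */
static int pin_umount(struct pin_mount *m)
{
    if (m->refs > 1)
        return -EBUSY;
    m->ntraps = 0;   /* stands in for the per-trap cleanup callback */
    m->mounted = 0;
    return 0;
}
```

With pin_attach_trap() the umount succeeds and removes the trap; with
pin_attach_pinned_trap() it returns -EBUSY until the trap is detached.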

>> Also, what happens if you attach one to a non-directory?  Could you use it
>> to put arbitrary "special files" into the namespace without having to do
>> anything special?  It would make things like Pavel's podfuk more useful
>> without having to do horrible namespace hacks as he does now.
> 
> Ummm... I'm not sure that I like the idea. Reason: I'm very suspicious of 
> the situations when file turns into directory and back. I've never seen it
> done right and in all cases when it had been done it was full of nasty
> special cases, kludges, etc. Mostly on the userland side of things, BTW.
> If you can do it in clean way and nothing will break I'll be only glad
> about that. Mechanism itself doesn't care for the type of that stuff, so I
> have no objections on that side. Just a nasty gut feeling...

Well, I don't see a good reason not to make a file respond to readdir.  chdir
and chroot currently prevent any non-directory from being current, so you need
to have a more general notion of directory-ness to make them work on magic
files.

The other approach is to have a file act like a symlink under some
circumstances, but I haven't thought that through properly.  That's essentially
what podfuk does at the moment, with its magic mapping to the /overlay tree.

Then there's the cases where all you want is some ordinary-looking files with
dynamic content.  That doesn't involve overlaying any incompatible semantics;
it just means you have a file in the namespace which isn't on the filesystem (I
guess you could get the same effect with a filesystem which has a file as the
top-level dentry, but I seem to remember that didn't work very well last time I
tried it).

>> Also, when one is inserted between two real filesystems, it still needs to
>> be able to mediate namespace lookups.  Autofs may need this to block access
>> to a filesystem while the daemon is umounting it.
> 
> It depends. How much are you going to do with the filesystem before
> umount(8)?

Probably not a lot, but I was thinking of something a bit more general than
autofs.

Actually, it would be nice to see transitions just to get a sense of how much
use a filesystem is getting.

>  two times slower. At least something... 

I suppose.  It would be nice to find some replacement for the fake block devices
for blockless filesystems.

J



Re: [RFC] Possible design for "mount traps"

2000-05-03 Thread Richard Gooch

Alexander Viro writes:
>   Folks, I've tried to describe the stuff that may IMO become useful
> for autofs/devfs/portalfs/etc. Comments are more than welcome.
> 
> Current problems:
>   5. Any schemes with automount-like stuff in devfs require
> (union-)mount being triggered if lookup brings negative in all
> components already mounted. IOW, if the search gets to the last
> component of union-mount.

I think you're referring here to a "split" devfs, where each driver
exports a mini-devfs. In such an environment, your mount traps would
probably be good.

However, I don't think the mini-devfs idea is a good approach. There
are good reasons for having a unified tree. For one thing, there is
the issue of mounting /. For another, some drivers (i.e. cdrom) need
linkages (not just symlinks) into other parts of the devfs namespace.

Also, it would be hard (or impossible) for related drivers to share
the same directory (i.e. SCSI subsystem). At the least, there would
have to be more co-operation between drivers. Compare this to the
current devfs implementation where things are fairly modular and
independent.

>   So what about the following trick: let's allow vfsmounts without
> associated superblock and allow to "mount" them even on the negative
> dentries? Notice that the latter will not break walk_name() - checks for
> dentry being negative are done after we try to follow mounts.
>   Notice also that once we mount something atop of such vfsmount it
> becomes completely invisible - it's wedged between two real objects and
> following mounts will walk through it without stopping.
>   So the only case when these beasts count is when they are
> "mounted", but nothing is mounted atop of them. But that's precisely the
> class of situations we are interested in. In case of autofs we want
> follow_down() into such animal to trigger mounting, in case of portalfs -
> passing the rest of pathname to daemon, in case of devfs-with-automount
> we want to kick devfsd. So let them have a method that would be called
> upon such follow_down() (i.e. one when we have nothing mounted atop of
> us). And that's it.

>   These objects are not filesystems - they rather look like a
> traps set in the unified tree. Notice that they do not waste anon
> device like "one node autofs" would do.
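
A minimal userspace sketch of the quoted idea may help, assuming a much
simplified model: struct toy_vfsmount, toy_follow_down() and toy_automount()
are hypothetical names, and the real follow_down() does not look like this.
A mount without a superblock is a trap; its method runs only when nothing is
mounted atop it, and once something is mounted there the walk passes straight
through.

```c
#include <stddef.h>

/* Hypothetical toy stand-in for a vfsmount; has_sb == 0 marks a trap. */
struct toy_vfsmount {
    const char *name;
    int has_sb;                                   /* superblock attached? */
    struct toy_vfsmount *atop;                    /* what is mounted on us */
    void (*on_follow)(struct toy_vfsmount *trap); /* trap method, or NULL */
    int fired;                                    /* times the method ran */
};

/* The filesystem that the hypothetical daemon would mount on demand. */
static struct toy_vfsmount toy_nfs = { "nfs:/export/bar", 1, NULL, NULL, 0 };

/* Hypothetical trap method: stands in for kicking the autofs daemon,
 * which then mounts the real filesystem atop the trap. */
static void toy_automount(struct toy_vfsmount *trap)
{
    trap->fired++;
    trap->atop = &toy_nfs;
}

/* Walk down through stacked mounts.  A covered trap is invisible; a bare
 * trap has its method called once per walk.  Returns the mount the walk
 * ends on, or NULL if the trap left nothing mounted. */
static struct toy_vfsmount *toy_follow_down(struct toy_vfsmount *m)
{
    while (m != NULL) {
        if (m->atop != NULL) {      /* something atop: walk through */
            m = m->atop;
            continue;
        }
        if (m->has_sb)              /* real filesystem: stop here */
            return m;
        if (m->on_follow != NULL)   /* bare trap: fire its method */
            m->on_follow(m);
        if (m->atop == NULL)        /* method mounted nothing */
            return NULL;
    }
    return NULL;
}
```

The first walk into a bare trap fires the method and lands on the newly
mounted filesystem; later walks go through the trap without firing it again.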

This sounds a lot like the fake inodes I proposed a couple of years
ago to solve the autofs direct mount problem.

Anyway, while these mount traps are a good thing, particularly for
autofs, I don't think they're going to help simplify devfs (without
castrating devfs and probably breaking the Linus-mandated
namespace;-).

Regards,

Richard



RE: [RFC] Possible design for "mount traps"

2000-05-04 Thread Petr Vandrovec

On  3 May 00 at 18:34, Alexander Viro wrote:
> > Also, what happens if you attach one to a non-directory?  Could you use it to
> > put arbitrary "special files" into the namespace without having to do anything
> > special?  It would make things like Pavel's podfuk more useful without having
> > to do horrible namespace hacks as he does now.
> Ummm... I'm not sure that I like the idea. Reason: I'm very suspicious of 
> the situations when file turns into directory and back. I've never seen it
> done right and in all cases when it had been done it was full of nasty
> special cases, kludges, etc. Mostly on the userland side of things, BTW.
> If you can do it in clean way and nothing will break I'll be only glad
> about that. Mechanism itself doesn't care for the type of that stuff, so I
> have no objections on that side. Just a nasty gut feeling...
Hi Alexander,
  if you are talking about turning an inode into a directory and back - what
is the correct thing to do? I have a similar problem in ncpfs when someone on
the server removes a file and creates a directory with the same name instead.
Or if someone turns a symlink into a file or back :-(
Thanks,
Petr Vandrovec




RE: [RFC] Possible design for "mount traps"

2000-05-04 Thread Alexander Viro



On Wed, 3 May 2000, Jeremy Fitzhardinge wrote:

> I'll happily get rid of the tree scanning if there's a better way of doing the
> same thing.  I don't want to change the basic mechanism of autofs4 right at the
> moment though.

OK, then. In practical terms it means that right now autofs4 retains the
scanning but gets switched to vfsmount linkage, traps go after the
multiple-mount stuff and once they are in (if Linus approves such a
beast, that is) tree-scanning may go.

> > Hrrrmmm... Probably the former, but I can argue it both ways ;-)
> 
> Well, you could get the latter behaviour from the former simply by holding an
> extra reference and preventing the umount, but you can't simulate the former
> from the latter.

Yep, but it means two damn mechanisms for removing them - you definitely
should be able to remove them explicitly and do it without umounting the
host.

[snip files-as-directories - let's get back to that stuff when somebody
will stand up and say "I'm going to start doing it right now", OK? Again,
mechanism doesn't care, so it won't take large changes in that area.]

> > It depends. How much are you going to do with the filesystem before
> > umount(8)?
> 
> Probably not a lot, but I was thinking of something a bit more general than
> autofs.
> 
> Actually, it would be nice to see transitions just to get a sense of how much
> use a filesystem is getting.

I'd rather see it done completely from userland. Theory: currently
umount(2) is expensive. Really expensive. The main reason being that we
are trying to shrink dcache way too early - before we know that tree is
not busy.
It can be helped. Notice that opened files will prevent the call
of shrink_dcache_sb() right now - you'll be stopped by ->mnt_count on
vfsmnt. Lookup-in-process will have the same effect. The only source of
situations when we can get to shrink_dcache_sb() and fail umount(2) looks
so: we do lookup, put vfsmount, do something and only after that put the
dentry. Which can be trivially fixed if we postpone mntput() until after
the dput(). Then may_umount() becomes utterly trivial - it should just
check ->mnt_count.
With light-weight umount(2) we are in completely different
situation - then the expiry code may be safely moved into userland.
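
The ordering argument can be sketched in userspace, assuming a toy model in
which one counter stands for ->mnt_count and another for dentries still held
by a lookup.  struct ord_tree and the ord_* functions are hypothetical names;
the "shrinks" counter merely stands in for the cost of shrink_dcache_sb().

```c
#include <errno.h>

/* Hypothetical toy model of one mounted tree. */
struct ord_tree {
    int mnt_count;      /* 1 == only the mount itself holds a reference */
    int busy_dentries;  /* dentries still referenced by lookups */
    int shrinks;        /* how often the expensive dcache pass ran */
};

/* A lookup takes a reference on both the vfsmount and the dentry. */
static void ord_lookup_get(struct ord_tree *t)
{
    t->mnt_count++;
    t->busy_dentries++;
}

static void ord_mntput(struct ord_tree *t) { t->mnt_count--; }
static void ord_dput(struct ord_tree *t)   { t->busy_dentries--; }

/* The cheap check: nobody but the mount itself holds a reference. */
static int ord_may_umount(const struct ord_tree *t)
{
    return t->mnt_count == 1;
}

static int ord_umount(struct ord_tree *t)
{
    if (!ord_may_umount(t))
        return -EBUSY;          /* cheap failure, nothing was shrunk */
    t->shrinks++;               /* stands in for shrink_dcache_sb() */
    if (t->busy_dentries > 0)
        return -EBUSY;          /* expensive failure: shrunk, still busy */
    return 0;
}
```

If a lookup drops the mount reference first, a umount arriving before the
dput() passes the cheap check, pays for the shrink and then fails.  If the
dentry is dropped first, the same umount is rejected by the cheap check alone.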

> >  two times slower. At least something... 
> 
> I suppose.  It would be nice to find some replacement for the fake block devices
> for blockless filesystems.

Three words: stat(2). st_dev. POSIX.
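
Read as a hint, the point seems to be that POSIX has st_dev and st_ino
together identify a file, so every mounted filesystem needs a device number
of its own, block device or not.  A small sketch using only stat(2); the
helper names same_device() and same_file() are made up for illustration.

```c
#include <sys/types.h>
#include <sys/stat.h>

/* Returns 1 if both paths live on the same device (equal st_dev),
 * 0 if they do not, and -1 if either stat(2) call fails. */
static int same_device(const char *a, const char *b)
{
    struct stat sa, sb;

    if (stat(a, &sa) != 0 || stat(b, &sb) != 0)
        return -1;
    return sa.st_dev == sb.st_dev;
}

/* Returns 1 if both paths name the same file, i.e. the (st_dev, st_ino)
 * pairs match, 0 if they differ, and -1 if either stat(2) call fails. */
static int same_file(const char *a, const char *b)
{
    struct stat sa, sb;

    if (stat(a, &sa) != 0 || stat(b, &sb) != 0)
        return -1;
    return sa.st_dev == sb.st_dev && sa.st_ino == sb.st_ino;
}
```

Userland tools rely on this pairing to detect mount boundaries and hard
links, which is why two filesystems sharing one st_dev would confuse them.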




Re: [RFC] Possible design for "mount traps"

2000-05-04 Thread Alexander Viro



On Wed, 3 May 2000, Richard Gooch wrote:

> I think you're referring here to a "split" devfs, where each driver
> exports a mini-devfs. In such an environment, your mount traps would
> probably be good.
> 
> However, I don't think the mini-devfs idea is a good approach. There
> are good reasons for having a unified tree. For one thing, there is
> the issue of mounting /. For another, some drivers (i.e. cdrom) need
> linkages (not just symlinks) into other parts of the devfs namespace.

Details, please? Notice that we are going to get equivalent of Plan 9
bind() RSN - it works in my tree right now and all I need to merge it into
the main tree is to sort out the autofs4 stuff. Well, since Jeremy wants
to postpone the autofs4 changes (i.e. not go for traps-based scheme right
now) - fine, I'm merging the autofs4 patches that switch the thing to new
linkage, toss in the ten-liner for knfsd (there linkage-related stuff is
minimal), test it and submit to Linus. mount -t bind goes immediately
after that. So getting the linkage between parts of unified tree and doing
that without any symlinks is trivial.

And what's up with mounting /?

> Also, it would be hard (or impossible) for related drivers to share
> the same directory (i.e. SCSI subsystem). At the least, there would
> have to be more co-operation between drivers. Compare this to the
> current devfs implementation where things are fairly modular and
> independent.

Why? union-mount their trees on /dev/scsi and you are done. No?

> Anyway, while these mount traps are a good thing, particularly for
> autofs, I don't think they're going to help simplify devfs (without
> castrating devfs and probably breaking the Linus-mandated
> namespace;-).

I wouldn't worry too much about the namespace breakage - check the bind(2)
manpage in Plan 9 to see what is coming (their manpages are available
and searchable on http://www.freebsd.org/docs.html#man, along with the
manpages from a lot of other systems - kudos to Wolfram Schneider and
freebsd.org folks; very convenient resource they had put there). We'll get
tools that can repair such breakage.




Re: [RFC] Possible design for "mount traps"

2000-05-04 Thread Richard Gooch

Alexander Viro writes:
> 
> 
> On Wed, 3 May 2000, Richard Gooch wrote:
> 
> > I think you're referring here to a "split" devfs, where each driver
> > exports a mini-devfs. In such an environment, your mount traps would
> > probably be good.
> > 
> > However, I don't think the mini-devfs idea is a good approach. There
> > are good reasons for having a unified tree. For one thing, there is
> > the issue of mounting /. For another, some drivers (i.e. cdrom) need
> > linkages (not just symlinks) into other parts of the devfs namespace.
> 
> And what's up with mounting /?

Well, it depends on how you plan on making the VFS work. Firstly, I
need to be able to add and remove inodes without devfs being mounted
(so / can be mounted). Also, I need to be able to grab hold of an
inode by name.

Next, if the mini-mounts idea is used (a mini FS per driver), they
will all have to be brought together (need a new name for mounting
which isn't really mounting (or is it mounting, just not exposing to
user-space?)). The unified namespace needs to be available to the
kernel: we can't wait for user-space to do that.

> > Also, it would be hard (or impossible) for related drivers to share
> > the same directory (i.e. SCSI subsystem). At the least, there would
> > have to be more co-operation between drivers. Compare this to the
> > current devfs implementation where things are fairly modular and
> > independent.
> 
> Why? union-mount their trees on /dev/scsi and you are done. No?

No, I don't think that will work. The problem is that the high-level
drivers share a directory: the deepest directory. So the directory for
the "device" would be: /dev/scsi/host0/bus0/target0/lun0/
and then the sr_mod driver has to create the "cd" entry and the sg
driver has to create the "generic" entry. This is because the
high-level drivers present different "views" of the same SCSI device.
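
The requirement can be pictured as a lookup across per-driver trees.  This is
a sketch only, assuming a union that tries the whole relative path in each
driver's tree in turn; the thread does not pin down union-mount semantics,
and struct toy_layer and toy_union_lookup() are hypothetical names.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical per-driver tree: the driver name plus the paths it exports,
 * relative to the union mount point (here imagined as /dev/scsi). */
struct toy_layer {
    const char *driver;
    const char *const *paths;
    size_t npaths;
};

/* Try each layer in order; the first layer that has the path wins.
 * Returns the name of the providing driver, or NULL when the path is
 * negative in every layer. */
static const char *toy_union_lookup(const struct toy_layer *layers,
                                    size_t nlayers, const char *path)
{
    size_t i, j;

    for (i = 0; i < nlayers; i++)
        for (j = 0; j < layers[i].npaths; j++)
            if (strcmp(layers[i].paths[j], path) == 0)
                return layers[i].driver;
    return NULL;
}
```

With one layer exporting host0/bus0/target0/lun0/cd and another exporting
host0/bus0/target0/lun0/generic, both names resolve inside what looks like a
single lun0 directory, which is the sharing the high-level drivers need.  A
NULL result is the "negative in all components" case from the original RFC.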

If we had the original devfs naming scheme of /dev/sr/ and /dev/sg/
and so forth, it would be easy to separate things. But with Linus'
naming scheme, it's not so simple (although I do agree that his scheme
is cleaner (but he had to drag me a bit:-)).

Anyway, tell me how you see this working with union mounting.

Also, please tell me what splitting devfs into a pile of mini-devfs'
buys us? If you're thinking that it will solve the multi-mount for
chroot(2) gaols, I don't think it will, because we want to control
things on an individual inode basis. Controlling things at the
directory (driver) level is too coarse-grained. Think /dev/null and
/dev/mem: both come from the same driver, both live in the same
directory.

What we need is a type of unionising overlay FS.

Regards,

Richard



Re: [RFC] Possible design for "mount traps"

2000-05-10 Thread Erez Zadok

Alexander Viro writes:
[...]
>   So what about the following trick: let's allow vfsmounts without
> associated superblock and allow to "mount" them even on the negative
> dentries? Notice that the latter will not break walk_name() - checks for
> dentry being negative are done after we try to follow mounts.
>   Notice also that once we mount something atop of such vfsmount it
> becomes completely invisible - it's wedged between two real objects and
> following mounts will walk through it without stopping.
>   So the only case when these beasts count is when they are
> "mounted", but nothing is mounted atop of them. But that's precisely the
> class of situations we are interested in. In case of autofs we want
> follow_down() into such animal to trigger mounting, in case of portalfs -
> passing the rest of pathname to daemon, in case of devfs-with-automount
> we want to kick devfsd. So let them have a method that would be called
> upon such follow_down() (i.e. one when we have nothing mounted atop of
> us). And that's it.
>   These objects are not filesystems - they rather look like a traps
> set in the unified tree. Notice that they do not waste anon device like
> "one node autofs" would do.
>   That way if autofs daemon mounted /mnt/net/foo it would not follow
> up with /mnt/net/foo/bar - it would just set the trap in /mnt/net/foo/bar
> and let the actual lookups trigger further mounts.
[...]

This sounds almost identical to what Sun did to solve similar problems in
their first version of autofs.  There's a paper in LISA '99 describing their
enhancements to the original autofs.  Your proposal, however, is better b/c
it generalizes to more than autofs.

Erez.