Re: FS_SINGLE queries
Tigran Aivazian writes: > On Thu, 29 Jun 2000, Richard Gooch wrote: > > What happens when you try (user-space) mounting a FS_SINGLE filesystem > > without calling kern_mount() first? > > I get an oops at the line of code I mentioned - it wasn't a theoretical > oops but a practical one :) > > Basically, I was documenting file_system_type structure and wrote: > > read_super - a pointer to the function that reads the super > block during mount operation. This function is required - if it is > not > provided, mount operation (whether from userspace or inkernel) will > always fail except in FS_SINGLE case where it will Oops in > get_sb_single() trying to dereference a NULL pointer in > fs_type->kern_mnt->mnt_sb with (fs_type->kern_mnt = NULL) if the > module did not call kern_mount() in the initialisation routine after > filesystem was successfully registered by register_filesystem(). > > Now, it would sound much nicer if I could just say instead: > > > read_super - a pointer to the function that reads the super > block during mount operation. This function is required - if it is > not provided, mount operation (whether from userspace or > inkernel) will always fail. > > Don't you agree? So, to test it I wrote a trivial filesystem that > sets FS_SINGLE and yet provides no read_super in fs_type and > discovered the oops. Then I added a dummy (always failing) > read_super() and it oops'd exactly the same - so it doesn't matter > if read_super is NULL or not for this thing (which reading > read_super() function suggests anyway). Hm. I agree that the documentation for read_super() would best not have to mention FS_SINGLE/kern_mount() issues. However, these issues should be discussed. I think you just need to document that at a higher level. That will be "clean". Regards, Richard Permanent: [EMAIL PROTECTED] Current: [EMAIL PROTECTED]
Re: FS_SINGLE queries
On Thu, 29 Jun 2000, Richard Gooch wrote: > What happens when you try (user-space) mounting a FS_SINGLE filesystem > without calling kern_mount() first? I get an oops at the line of code I mentioned - it wasn't a theoretical oops but a practical one :) Basically, I was documenting file_system_type structure and wrote: read_super - a pointer to the function that reads the super block during mount operation. This function is required - if it is not provided, mount operation (whether from userspace or inkernel) will always fail except in FS_SINGLE case where it will Oops in get_sb_single() trying to dereference a NULL pointer in fs_type->kern_mnt->mnt_sb with (fs_type->kern_mnt = NULL) if the module did not call kern_mount() in the initialisation routine after filesystem was successfully registered by register_filesystem(). Now, it would sound much nicer if I could just say instead: read_super - a pointer to the function that reads the super block during mount operation. This function is required - if it is not provided, mount operation (whether from userspace or inkernel) will always fail. Don't you agree? So, to test it I wrote a trivial filesystem that sets FS_SINGLE and yet provides no read_super in fs_type and discovered the oops. Then I added a dummy (always failing) read_super() and it oops'd exactly the same - so it doesn't matter if read_super is NULL or not for this thing (which reading read_super() function suggests anyway). Regards, Tigran
Re: FS_SINGLE queries
Tigran Aivazian writes:
> On Thu, 29 Jun 2000, Richard Gooch wrote:
> > Hm. Digging back into my archives, I see I said I got a kernel BUG. So
> > that means I got a BUG, not an Oops. Perhaps that means that *fs_type
> > hasn't been initialised to 0, or perhaps that fs_type->kern_mnt gets
> > initialised elsewhere even when kern_mount() isn't called (and perhaps
> > kern_mount() just initialises fs_type->kern_mnt->mnt_sb).
> > Speculations only: I haven't RTFS.
>
> all I am really saying is that this simple filesystem should generate a
> BUG() (pointing to the fact that it should be kern_mount-ed first) and not
> an oops:

I agree one should get a BUG. And that's what I got. So although the
piece of code you looked at suggests you'd get an Oops instead, I
suspect there must be something more subtle happening.

What happens when you try (user-space) mounting a FS_SINGLE filesystem
without calling kern_mount() first?

Regards,

Richard
Permanent: [EMAIL PROTECTED]
Current:   [EMAIL PROTECTED]
Re: FS_SINGLE queries
On Thu, 29 Jun 2000, Richard Gooch wrote: > Tigran Aivazian writes: > > On Sat, 10 Jun 2000, Alexander Viro wrote: > > > > - although not documented, you need to do kern_mount() before trying > > > Yup. > > > > normal mounts of a FS_SINGLE; perhaps kern_mount()/kern_umount() > > > > should be called automatically in > > > > register_filesystem()/unregister_filesystem()? > > > > > > I don't think so. They are different operations and I'm not too happy > > > about mixing them together. Matter of taste, but... > > > > In get_sb_single() you wrote: > > > > sb = fs_type->kern_mnt->mnt_sb; > > if (!sb) > > BUG(); > > > > and it is kern_mount() that initialises type->kern_mnt. So, if one forgot > > to kern_mount a FS_SINGLE filesystem prior to letting userspace try to > > mount(2) it, then it is not the BUG() that we hit but an oops of this > > kind: > > > > Code; c013c6b1<= > >0: 8b 58 1c mov0x1c(%eax),%ebx <= > > > > (0x1c being offset of mnt_sb in vfsmount) > > > > i.e. maybe we should really have in get_sb_single(): > > > > if (!fs_type->kern_mnt || !(sb = fs_type->kern_mnt->mnt_sb)) > > BUG(); > > > > I.e. if one forgot to kern_mount then fs_type->kern_mnt will be probably > > left at NULL so one is more likely to follow a NULL pointer via ->kern_mnt > > rather that follow somewhere valid and then find NULL via ->mnt_sb? > > > > Richard, how is it that you actually hit the BUG() above? > > Hm. Digging back into my archives, I see I said I got a kernel BUG. So > that means I got a BUG, not an Oops. Perhaps that means that *fs_type > hasn't been initialised to 0, or perhaps that fs_type->kern_mnt gets > initialised elsewhere even when kern_mount() isn't called (and perhaps > kern_mount() just initialises fs_type->kern_mnt->mnt_sb). > Speculations only: I haven't RTFS. 
all I am really saying is that this simple filesystem should generate a
BUG() (pointing to the fact that it should be kern_mount-ed first) and not
an oops:

    static DECLARE_FSTYPE(single_fs_type, "single", NULL, FS_SINGLE);

    static int __init init_single_fs(void)
    {
            return register_filesystem(&single_fs_type);
    }

    static void __exit exit_single_fs(void)
    {
            unregister_filesystem(&single_fs_type);
    }

Regards,
Tigran
Re: FS_SINGLE queries
Tigran Aivazian writes: > On Sat, 10 Jun 2000, Alexander Viro wrote: > > > - although not documented, you need to do kern_mount() before trying > > Yup. > > > normal mounts of a FS_SINGLE; perhaps kern_mount()/kern_umount() > > > should be called automatically in > > > register_filesystem()/unregister_filesystem()? > > > > I don't think so. They are different operations and I'm not too happy > > about mixing them together. Matter of taste, but... > > In get_sb_single() you wrote: > > sb = fs_type->kern_mnt->mnt_sb; > if (!sb) > BUG(); > > and it is kern_mount() that initialises type->kern_mnt. So, if one forgot > to kern_mount a FS_SINGLE filesystem prior to letting userspace try to > mount(2) it, then it is not the BUG() that we hit but an oops of this > kind: > > Code; c013c6b1<= >0: 8b 58 1c mov0x1c(%eax),%ebx <= > > (0x1c being offset of mnt_sb in vfsmount) > > i.e. maybe we should really have in get_sb_single(): > > if (!fs_type->kern_mnt || !(sb = fs_type->kern_mnt->mnt_sb)) > BUG(); > > I.e. if one forgot to kern_mount then fs_type->kern_mnt will be probably > left at NULL so one is more likely to follow a NULL pointer via ->kern_mnt > rather that follow somewhere valid and then find NULL via ->mnt_sb? > > Richard, how is it that you actually hit the BUG() above? Hm. Digging back into my archives, I see I said I got a kernel BUG. So that means I got a BUG, not an Oops. Perhaps that means that *fs_type hasn't been initialised to 0, or perhaps that fs_type->kern_mnt gets initialised elsewhere even when kern_mount() isn't called (and perhaps kern_mount() just initialises fs_type->kern_mnt->mnt_sb). Speculations only: I haven't RTFS. Regards, Richard Permanent: [EMAIL PROTECTED] Current: [EMAIL PROTECTED]
Re: FS_SINGLE queries
On Sat, 10 Jun 2000, Alexander Viro wrote:
> > - although not documented, you need to do kern_mount() before trying
>
> Yup.
>
> >   normal mounts of a FS_SINGLE; perhaps kern_mount()/kern_umount()
> >   should be called automatically in
> >   register_filesystem()/unregister_filesystem()?
>
> I don't think so. They are different operations and I'm not too happy
> about mixing them together. Matter of taste, but...

In get_sb_single() you wrote:

        sb = fs_type->kern_mnt->mnt_sb;
        if (!sb)
                BUG();

and it is kern_mount() that initialises type->kern_mnt. So, if one forgot
to kern_mount a FS_SINGLE filesystem prior to letting userspace try to
mount(2) it, then it is not the BUG() that we hit but an oops of this
kind:

Code;  c013c6b1 <=
   0:   8b 58 1c        mov    0x1c(%eax),%ebx   <=

(0x1c being the offset of mnt_sb in vfsmount)

i.e. maybe we should really have in get_sb_single():

        if (!fs_type->kern_mnt || !(sb = fs_type->kern_mnt->mnt_sb))
                BUG();

I.e. if one forgot to kern_mount then fs_type->kern_mnt will probably be
left at NULL, so one is more likely to follow a NULL pointer via ->kern_mnt
rather than follow somewhere valid and then find NULL via ->mnt_sb?

Richard, how is it that you actually hit the BUG() above?

Regards,
Tigran
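[Editorial illustration] The ordering argument in the message above can be tried outside the kernel. What follows is a minimal userspace sketch, not kernel code: the structures are cut-down stand-ins that carry only the fields named in the thread, and `get_sb_single_checked()` and the `sb_status` codes are hypothetical names that report the condition where the kernel version would call BUG().

```c
#include <stddef.h>

/* Userspace stand-ins for the kernel structures named in the thread.
 * Only the fields under discussion are modelled. */
struct super_block { int s_dummy; };

struct vfsmount {
    struct super_block *mnt_sb;
};

struct file_system_type {
    const char *name;
    struct vfsmount *kern_mnt;   /* set by kern_mount(); NULL before that */
};

/* Hypothetical result codes, standing in for BUG(). */
enum sb_status { SB_OK = 0, SB_NOT_KERN_MOUNTED = 1 };

/* Model of the proposed check: test kern_mnt before following it, so a
 * filesystem that was never kern_mount()ed is reported explicitly rather
 * than triggering a NULL pointer dereference. */
enum sb_status get_sb_single_checked(const struct file_system_type *fs_type,
                                     struct super_block **out)
{
    struct super_block *sb;

    if (!fs_type->kern_mnt || !(sb = fs_type->kern_mnt->mnt_sb))
        return SB_NOT_KERN_MOUNTED;   /* the kernel version would BUG() here */

    *out = sb;
    return SB_OK;
}
```

Because `||` short-circuits, `fs_type->kern_mnt->mnt_sb` is only evaluated once `kern_mnt` is known to be non-NULL, which is the whole point of the proposed one-line change.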
Re: FS_SINGLE queries
On Fri, 16 Jun 2000, Richard Gooch wrote: > - mount procfs on boot :-> > > - modify init(8) to not need /dev/tty (which would be a pity, because > session management before starting boot scripts is damn handy) > > - put all the virtual filesystems under a (known, fixed) kernel > namespace and allow a way to access that namespace from user-space > (alternative to mounting procfs at boot, so that we don't get > bleatings about "policy"). > > Tsk, tsk... Too obvious one - consider taking lessons from Albert...
Re: FS_SINGLE queries
Alexander Viro writes: > On Fri, 16 Jun 2000, Richard Gooch wrote: > > > - mount procfs on boot :-> > > > > - modify init(8) to not need /dev/tty (which would be a pity, because > > session management before starting boot scripts is damn handy) > > > > - put all the virtual filesystems under a (known, fixed) kernel > > namespace and allow a way to access that namespace from user-space > > (alternative to mounting procfs at boot, so that we don't get > > bleatings about "policy"). > > > > > > Tsk, tsk... Too obvious one - consider taking lessons from Albert... Bugger. That was no fun at all. Regards, Richard Permanent: [EMAIL PROTECTED] Current: [EMAIL PROTECTED]
Re: FS_SINGLE queries
Alexander Viro writes:
> On Fri, 16 Jun 2000, Richard Gooch wrote:
> > Agreed. /dev/tty always struck me as a bit evil^Wmagic. At the very
> > least, a symlink to /proc/self/tty would make it pretty damn clear
> > even to a novice.
>
> Unfortunately, unlike /proc/mounts, /dev/tty has to be available
> before mounting procfs. Alas ;-<

Some solutions:

- mount procfs on boot :->

- modify init(8) to not need /dev/tty (which would be a pity, because
  session management before starting boot scripts is damn handy)

- put all the virtual filesystems under a (known, fixed) kernel
  namespace and allow a way to access that namespace from user-space
  (alternative to mounting procfs at boot, so that we don't get
  bleatings about "policy").

Regards,

Richard
Permanent: [EMAIL PROTECTED]
Current:   [EMAIL PROTECTED]
Re: FS_SINGLE queries
On Fri, 16 Jun 2000, Richard Gooch wrote:
> Alexander Viro writes:
> >
> > On Fri, 16 Jun 2000, Erez Zadok wrote:
> >
> > > Hey, we can make it yet another ioctl(2). Then we can trade a crapload of
> > > syscalls for a crapload of ioctls --- a time-honored Unix tradition... :-)
> > > :-)
> > >
> > > Seriously, an open/read/.../close would work fine, but on what file? If
> > > it's something inside /proc, fine, but has the Linux community as a whole
> > > accepted that procfs is a *must* for any working system "or else"? If the
> > > file to open/read/close won't be in /proc, what type of file it'd be and how
> > > it'd get created?
> >
> > Depends. If we have per-process namespaces - procfs is the only way
> > to go, simply because there is no such thing as system-wide set of
> > mounts. However, that procfs will not have to contain anything but
> > per-process data + /proc/self. Another variant is a mechanism a-la
> > /dev/tty, but frankly, I would rather see /dev/tty being a symlink
> > to /proc/self/tty...
>
> Agreed. /dev/tty always struck me as a bit evil^Wmagic. At the very
> least, a symlink to /proc/self/tty would make it pretty damn clear
> even to a novice.

Unfortunately, unlike /proc/mounts, /dev/tty has to be available
before mounting procfs. Alas ;-<
Re: FS_SINGLE queries
Alexander Viro writes: > > > On Fri, 16 Jun 2000, Erez Zadok wrote: > > > Hey, we can make it yet another ioctl(2). Then we can trade a crapload of > > syscalls for a crapload of ioctls --- a time-honored Unix tradition... :-) > > :-) > > > > Seriously, an open/read/.../close would work fine, but on what file? If > > it's something inside /proc, fine, but has the Linux community as a whole > > accepted that procfs is a *must* for any working system "or else"? If the > > file to open/read/close won't be in /proc, what type of file it'd be and how > > it'd get created? > > Depends. If we have per-process namespaces - procfs is the only way > to go, simply because there is no such thing as system-wide set of > mounts. However, that procfs will not have to contain anything but > per-process data + /proc/self. Another variant is a mechanism a-la > /dev/tty, but frankly, I would rather see /dev/tty being a symlink > to /proc/self/tty... Agreed. /dev/tty always struck me as a bit evil^Wmagic. At the very least, a symlink to /proc/self/tty would make it pretty damn clear even to a novice. Regards, Richard Permanent: [EMAIL PROTECTED] Current: [EMAIL PROTECTED]
Re: FS_SINGLE queries
On Fri, 16 Jun 2000, Erez Zadok wrote: > Hey, we can make it yet another ioctl(2). Then we can trade a crapload of > syscalls for a crapload of ioctls --- a time-honored Unix tradition... :-) > :-) > > Seriously, an open/read/.../close would work fine, but on what file? If > it's something inside /proc, fine, but has the Linux community as a whole > accepted that procfs is a *must* for any working system "or else"? If the > file to open/read/close won't be in /proc, what type of file it'd be and how > it'd get created? Depends. If we have per-process namespaces - procfs is the only way to go, simply because there is no such thing as system-wide set of mounts. However, that procfs will not have to contain anything but per-process data + /proc/self. Another variant is a mechanism a-la /dev/tty, but frankly, I would rather see /dev/tty being a symlink to /proc/self/tty...
Re: FS_SINGLE queries
In message <[EMAIL PROTECTED]>, Alexander Viro writes: > > > On Fri, 16 Jun 2000, Erez Zadok wrote: > [...] > > Anyway, I'd like to see a new syscall that returns a list of mounts and > > Sigh... We already have a crapload of syscalls that should not be there. > If it can be done by open()/read()/write()/lseek()/close() it should be > done that way. Hey, we can make it yet another ioctl(2). Then we can trade a crapload of syscalls for a crapload of ioctls --- a time-honored Unix tradition... :-) :-) Seriously, an open/read/.../close would work fine, but on what file? If it's something inside /proc, fine, but has the Linux community as a whole accepted that procfs is a *must* for any working system "or else"? If the file to open/read/close won't be in /proc, what type of file it'd be and how it'd get created? Erez.
Re: FS_SINGLE queries
On Fri, 16 Jun 2000, Erez Zadok wrote:
> > I'm not sure that we need to keep it on procfs - especially with the
> > union-mounts coming into the game.
>
> Procfs or not, I'm advocating for keeping it in the kernel only, where it
> belongs, and removing the kludgy need (ala Sun and many others) to maintain
> a separate /etc/mtab file.

Oh, definitely.

> Anyway, I'd like to see a new syscall that returns a list of mounts and

Sigh... We already have a crapload of syscalls that should not be there.
If it can be done by open()/read()/write()/lseek()/close() it should be
done that way.

> associated info in linux. Currently that can be done by reading
> /proc/mounts, but not if procfs isn't available or we're going to take
> /proc/mounts away.

/proc/<pid>/ns. And we can't remove /proc/mounts (albeit we can take all
crap out of procfs and union-mount it over the thing).
Re: FS_SINGLE queries
In message <[EMAIL PROTECTED]>, [EMAIL PROTECTED] writes:
[...]
> so mount could keep a /etc/mtab2 to record this information, but that's
> freaking ugly. or we could pass a new mount option down into the kernel
> which causes it to display `loop' in that entry, but this seems like a
> waste of a bit. other alternatives gladly sought.

Not necessarily. Several OSs use an "ignore" bit as a mount flag telling
programs like df(1) not to stat certain entries by default. This is often
used for automounted/autofs entries, where normally no reasonable info can
be returned to statvfs(2), plus it's a good idea not to slow df(1) by
stating file systems that may be served by slow user-level file servers
(amd w/o autofs support). There are cases where such file servers can
return useful info back to statvfs(2) (as amd can).

BTW, the usual reason you don't see such automounted entries is that GNU df
automatically will not list entries with statfs values of 0, but it still
will statfs(2) them, which will be slow (and hang if the automounter is
hung). It's much better if the kernel can record that certain entries were
mounted w/ the "ignore" option, and ensure that df(1) simply doesn't
statfs them at all.

Erez.
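[Editorial illustration] The distinction drawn above, between not listing an entry and not stat'ing it at all, can be sketched as two separate decisions. This is a rough sketch under stated assumptions rather than how any particular df(1) is written: `struct mount_entry`, its `ignore` flag, and all three function names are hypothetical, and only statvfs(3) is a real library call.

```c
#include <stdbool.h>
#include <sys/statvfs.h>

/* Hypothetical mount-table entry; "ignore" models the mount flag
 * described in the message above. */
struct mount_entry {
    const char *dir;
    bool ignore;
};

/* Decide whether to call statvfs() at all. Skipping the call is what
 * protects the tool from a slow or hung user-level file server. */
bool should_stat(const struct mount_entry *e, bool show_all)
{
    return show_all || !e->ignore;
}

/* Decide whether to print an entry that has already been stat'ed:
 * zero-block filesystems are hidden unless everything was requested. */
bool should_list(const struct statvfs *st, bool show_all)
{
    return show_all || st->f_blocks != 0;
}

/* Returns 1 if the entry would be listed, 0 if skipped,
 * -1 if statvfs() failed. */
int report_entry(const struct mount_entry *e, bool show_all)
{
    struct statvfs st;

    if (!should_stat(e, show_all))
        return 0;                /* never touches the filesystem */
    if (statvfs(e->dir, &st) != 0)
        return -1;
    return should_list(&st, show_all) ? 1 : 0;
}
```

Filtering on zero blocks alone corresponds to the behaviour attributed to GNU df above: the entry is hidden, but only after the potentially blocking call has been made.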
Re: FS_SINGLE queries
On Fri, Jun 16, 2000 at 02:15:32PM +0100, Tigran Aivazian wrote:
> while we are on the subject of obsoleting /etc/mtab in favour of
> /proc/mounts (or enhanced version thereof), please keep in mind the
> classical problem of "mount -o loop" stopping to work if you use
> /proc/mounts. I think Andries Brouwer explained in the past why was that
> the case but I can't remember the reason. The problem was that loopback
> module's refcount will leak (up) when using /proc/mounts unless you
> manually losetup -d appropriate number of times.

oh, the reason is pretty straightforward: mount(8) writes `loop' into the
mount options if /etc/mtab isn't a symlink to /proc/mounts. if it is, it
can't modify /proc/mounts. umount(8) will automatically delete the
underlying loop device if the `loop' keyword is there. i suspect it doesn't
autodelete the device based on the device matching /dev/lo* because the
user may have set it up by hand and be very unhappy about mount trying to
be too clever and delete it.

so mount could keep a /etc/mtab2 to record this information, but that's
freaking ugly. or we could pass a new mount option down into the kernel
which causes it to display `loop' in that entry, but this seems like a
waste of a bit. other alternatives gladly sought.

something i hope to get time to look at next week is redoing the loop
device to take advantage of the LVM hooks. it might be significantly
faster than the current system.

--
The Sex Pistols were revolutionaries. The Bay City Rollers weren't.
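[Editorial illustration] The umount-side decision described above comes down to asking whether the recorded options for an entry contain `loop`. A small sketch of that test, using glibc's hasmntopt(3); the helper name `entry_records_loop()` is made up for this note, and the real umount(8) logic is not reproduced here.

```c
#include <mntent.h>
#include <stdbool.h>
#include <stddef.h>

/* Returns true if a comma-separated option string, as it would appear in
 * the mnt_opts field of an mtab entry, records the `loop' keyword. */
bool entry_records_loop(const char *mnt_opts)
{
    struct mntent ent = { 0 };

    /* hasmntopt() only reads the string, so dropping const is safe here. */
    ent.mnt_opts = (char *)mnt_opts;
    return hasmntopt(&ent, "loop") != NULL;
}
```

When /etc/mtab is a symlink to /proc/mounts, the option string comes from the kernel, which never saw `loop`, so a test like this comes back false and the loop device is left allocated. That is the leak described in the quoted message.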
Re: FS_SINGLE queries
On Fri, 16 Jun 2000, Alexander Viro wrote:
> On Fri, 16 Jun 2000, Erez Zadok wrote:
>
> > On a related note, since we do have /proc/mounts, and assuming that procfs
> > is pretty much necessary nowadays, are we going to get rid of /etc/mtab and
> > completely move all getmntent info into the kernel? I never liked the fact
> > that people doing mounts (such as automounters) have to ensure that they
> > correctly maintain a separate text file in /etc.
>
> I'm not sure that we need to keep it on procfs - especially with the
> union-mounts coming into the game.
> ...
> > Hmmm, maybe that's a question to the glibc folks. I guess as long as all
> > the necessary tools and libraries will use /proc/mounts if available, and
> > avoid using /etc/mtab, that'd be ok.
>
> How many programs actually need this getmntent(), in the first
> place?
>

while we are on the subject of obsoleting /etc/mtab in favour of
/proc/mounts (or enhanced version thereof), please keep in mind the
classical problem of "mount -o loop" ceasing to work if you use
/proc/mounts. I think Andries Brouwer explained in the past why that was
the case but I can't remember the reason. The problem was that the loopback
module's refcount will leak (up) when using /proc/mounts unless you
manually losetup -d the appropriate number of times.

Regards,
Tigran
Re: FS_SINGLE queries
In message <[EMAIL PROTECTED]>, Alexander Viro writes:
>
> On Fri, 16 Jun 2000, Erez Zadok wrote:
>
> > On a related note, since we do have /proc/mounts, and assuming that procfs
> > is pretty much necessary nowadays, are we going to get rid of /etc/mtab and
> > completely move all getmntent info into the kernel? I never liked the fact
> > that people doing mounts (such as automounters) have to ensure that they
> > correctly maintain a separate text file in /etc.
>
> I'm not sure that we need to keep it on procfs - especially with the
> union-mounts coming into the game.

Procfs or not, I'm advocating for keeping it in the kernel only, where it
belongs, and removing the kludgy need (ala Sun and many others) to maintain
a separate /etc/mtab file.

> > Hmmm, maybe that's a question to the glibc folks. I guess as long as all
> > the necessary tools and libraries will use /proc/mounts if available, and
> > avoid using /etc/mtab, that'd be ok.
>
> How many programs actually need this getmntent(), in the first
> place?

Programs like df(1) need to read mtab. Automounters (such as amd, which I
maintain) and /bin/mount need to write it. The problem with a separate mtab
file is that there's no way to guarantee that the file in /etc is in sync
w/ the actual mounts in the kernel. There are many reasons why you can get
an mtab file that's out of sync w/ the actual in-kernel mounts.

AIX, Ultrix, and BSD44 did the right thing by moving this mtab list into
the kernel, and rewriting "[gs]etmntent" (they also renamed them) so they
query the kernel via a syscall. Solaris 8 moved that way too, but kept
backwards compatibility using their special mntfs.

Anyway, I'd like to see a new syscall that returns a list of mounts and
associated info in linux. Currently that can be done by reading
/proc/mounts, but not if procfs isn't available or we're going to take
/proc/mounts away.
It would make programs like df more reliable, and programs like /bin/mount won't have to rewrite the mtab file each time a mount(2) is made. And it'll make amd work a little faster (I already auto-detect in-kernel vs. in-/etc mount tables and handle that in amd). Anyway it's not a big thing or something that we need to do right now. Erez.
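[Editorial illustration] For readers unfamiliar with the getmntent interface under discussion, here is a minimal sketch of the reading side as a df-like program might use it. It relies on glibc's setmntent(3), getmntent(3) and endmntent(3); the function name `count_mounts()` is an assumption made for this example. The same code works on /etc/mtab or /proc/mounts because both use the same line format.

```c
#include <mntent.h>
#include <stdio.h>
#include <string.h>

/* Counts the entries in a mount table in mtab format.
 * If fstype is non-NULL, only entries of that filesystem type are counted.
 * Returns -1 if the table cannot be opened. */
int count_mounts(const char *table, const char *fstype)
{
    FILE *fp = setmntent(table, "r");
    struct mntent *ent;
    int n = 0;

    if (fp == NULL)
        return -1;

    while ((ent = getmntent(fp)) != NULL) {
        if (fstype == NULL || strcmp(ent->mnt_type, fstype) == 0)
            n++;
    }

    endmntent(fp);
    return n;
}
```

A caller would typically try `count_mounts("/proc/mounts", NULL)` first and fall back to `/etc/mtab` on -1, which is the kind of auto-detection the message above says amd already performs.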
Re: FS_SINGLE queries
On Fri, 16 Jun 2000, Erez Zadok wrote:
> On a related note, since we do have /proc/mounts, and assuming that procfs
> is pretty much necessary nowadays, are we going to get rid of /etc/mtab and
> completely move all getmntent info into the kernel? I never liked the fact
> that people doing mounts (such as automounters) have to ensure that they
> correctly maintain a separate text file in /etc.

I'm not sure that we need to keep it on procfs - especially with the
union-mounts coming into the game.

> If we want to go crazy, we can implement mntfs ala Solaris 8, which moved
> the mnt info into the kernel, but allowed for "editing" /etc/mnttab which is
> now a special f/s mounted on top of a single file.

I'd rather not do it. I know that I'm biased, but Sun seems to be
seriously bound on turning their system into a tasteless pile of kludges.
I wouldn't trust SunSoft to design a loo cover - it would weigh a ton,
come with HTML manual, require a complex system of levers to raise and
have a nasty habit of coming down _hard_ in the most inconvenient moments.
Sorry - still bitter over their switch from SunOS 4 to Slowlartus...

> Hmmm, maybe that's a question to the glibc folks. I guess as long as all
> the necessary tools and libraries will use /proc/mounts if available, and
> avoid using /etc/mtab, that'd be ok.

How many programs actually need this getmntent(), in the first
place?
Re: FS_SINGLE queries
In message <[EMAIL PROTECTED]>, Alexander Viro writes: > > > On Sat, 10 Jun 2000, Richard Gooch wrote: > > > I see your point. However, that suggests that the naming of > > /proc/mounts is wrong. Perhaps we should have a /proc/namespace that > > shows all these VFS bindings, and separately a list of real mounts. > > What's "real"? /proc/mounts would better left as it was (funny replacement > for /etc/mtab) and there should be something along the lines of > /proc/namespace (hell knows, we might make it compatible with /proc/ns > from new Plan 9). That something most definitely doesn't need to share the > format with /proc/mounts... On a related note, since we do have /proc/mounts, and assuming that procfs is pretty much necessary nowadays, are we going to get rid of /etc/mtab and completely move all getmntent info into the kernel? I never liked the fact that people doing mounts (such as automounters) have to ensure that they correctly maintain a separate text file in /etc. If we want to go crazy, we can implement mntfs ala Solaris 8, which moved the mnt info into the kernel, but allowed for "editing" /etc/mnttab which is now a special f/s mounted on top of a single file. Hmmm, maybe that's a question to the glibc folks. I guess as long as all the necessary tools and libraries will use /proc/mounts if available, and avoid using /etc/mtab, that'd be ok. Erez.
Re: FS_SINGLE queries
Alexander Viro writes: > On Sat, 10 Jun 2000, Richard Gooch wrote: > > Will it really make much difference? What would be harder to do > > without mount IDs? And how much harder? > > Beware of functions with many arguments... Besides, what about "kill > the component of union-mount on /barf NFS-mounted from > venus:/foo/bar"? What exactly are you going to pass here? Such > stuff is better left to userland. Let's see. Pass the same stuff you see in /proc/namespace? Instead of cut-and-paste of the mount ID, cut-and-paste the other entries on that line. You're making the decision based on what's in /proc/namespace anyway. Why add another level of indirection? > > > And then... consider the situation when root logs in and decides to > > > mess with luser's namespace. > > > > What about it? > > bastard@venus% su - > Password: > root@venus% w luser > > luser pts/0 ... > root@venus% ps t pts/0 > > > 728 pts/0 ... > root@venus% cat /proc/728/ns > > > 123749/home/luser/foo / nfs > root@venus% umount -I 123749 > root@venus% logout > bastard@venus% mail luser > Subject: you've been told to umount ~/foo > > ^D > > > Avoiding numbers is a good thing. They have no intrinsic meaning. > > Tell that to guys who invented file descriptors. IMO that works > quite fine - I'll prefer to do close(17) rather than incantations of horror needed on OS/360> type that stuff late on Saturday> But FDs are different. You reference the file by name, and the system returns an opaque handle. But the system isn't returning a mount ID as the result of some operation: you had to scan a file for it. So really your handle to the mount is something else, the mount ID is merely another level of indirection. Regards, Richard Permanent: [EMAIL PROTECTED] Current: [EMAIL PROTECTED]
Re: FS_SINGLE queries
On Sat, 10 Jun 2000, Richard Gooch wrote:
> Will it really make much difference? What would be harder to do
> without mount IDs? And how much harder?

Beware of functions with many arguments... Besides, what about "kill the
component of union-mount on /barf NFS-mounted from venus:/foo/bar"? What
exactly are you going to pass here? Such stuff is better left to userland.

> > And then... consider the situation when root logs in and decides to
> > mess with luser's namespace.
>
> What about it?

bastard@venus% su -
Password:
root@venus% w luser
luser pts/0 ...
root@venus% ps t pts/0
728 pts/0 ...
root@venus% cat /proc/728/ns
123749 /home/luser/foo / nfs
root@venus% umount -I 123749
root@venus% logout
bastard@venus% mail luser
Subject: you've been told to umount ~/foo
^D

> Avoiding numbers is a good thing. They have no intrinsic meaning.

Tell that to guys who invented file descriptors. IMO that works quite fine
- I'll prefer to do close(17) rather than
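[Editorial illustration] The union-mount argument above is that a mountpoint alone no longer names a single mount once several components are stacked on it, whereas an ID does. A toy userspace sketch of that idea: the record layout, the example table and `umount_by_id()` are all hypothetical and stand in for whatever a `umount -I` implementation would do.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical in-memory mount record; not a kernel structure. */
struct mount_rec {
    int id;
    const char *mountpoint;
    const char *source;
};

/* Removes the record with the given ID from table[0 .. *count).
 * Returns 0 on success, -1 if no record has that ID. */
int umount_by_id(struct mount_rec *table, size_t *count, int id)
{
    size_t i;

    for (i = 0; i < *count; i++) {
        if (table[i].id == id) {
            /* Close the gap left by the removed record. */
            memmove(&table[i], &table[i + 1],
                    (*count - i - 1) * sizeof table[0]);
            (*count)--;
            return 0;
        }
    }
    return -1;
}
```

With two records sharing the mountpoint `/barf`, one local and one from `venus:/foo/bar`, a path-only interface would need extra arguments to say which component to drop; here the caller just passes the one integer read from the listing.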
Re: FS_SINGLE queries
Alexander Viro writes: > > > On Sat, 10 Jun 2000, Richard Gooch wrote: > > > Yeah, sure. I did say "for example". Your format looks fine. One > > question: is the mount ID really needed? Can't you distinguish based > > on what FS you're mounting (and mountpoint root)? > > First of all, interface is simpler that way. Will it really make much difference? What would be harder to do without mount IDs? And how much harder? > And then... consider the situation when root logs in and decides to > mess with luser's namespace. What about it? Avoiding numbers is a good thing. They have no intrinsic meaning. Regards, Richard Permanent: [EMAIL PROTECTED] Current: [EMAIL PROTECTED]
Re: FS_SINGLE queries
On Sat, 10 Jun 2000, Richard Gooch wrote: > Yeah, sure. I did say "for example". Your format looks fine. One > question: is the mount ID really needed? Can't you distinguish based > on what FS you're mounting (and mountpoint root)? First of all, interface is simpler that way. And then... consider the situation when root logs in and decides to mess with luser's namespace.
Re: FS_SINGLE queries
Alexander Viro writes:
>
> On Sat, 10 Jun 2000, Richard Gooch wrote:
>
> > What I mean by "real" mounts is a table that shows how each FS was
> > brought into the namespace (or each namespace, once you implement
> > CLONE_NEWNS). So for example:
> > #device     filesystem  roots
> > /dev/hda1   ext2        /
> > /dev/hda2   ext2        /var/spool/mail /gaol/var/spool/mail
> > none        proc        /proc /gaol/proc
>
> Bad format. If anything, it should contain mount IDs (if you want to have
> union-mount you need those, just to be able to take away components).
> The following might go:
>
> 1   /                        /                   ext2    /dev/hda1
> 2   /var/spool/mail          /                   ext2    /dev/hda2
> 3   /proc                    /                   procfs
> 14  /gaol/var/spool/mail     /                   ext2    /dev/hda2
> 15  /gaol/proc               /                   procfs
> 42  /gaol/lib/libc.2.1.3.so  /lib/libc.2.1.3.so  ext2    /dev/hda1
> ...
>
> IOW, ID + mountpoint + location of root in its tree + fs type +
> fs-specific parameters. That at least allows to reproduce the
> namespace. And yes, IMO "device" is fs-specific parameter.

Yeah, sure. I did say "for example". Your format looks fine. One
question: is the mount ID really needed? Can't you distinguish based
on what FS you're mounting (and mountpoint root)?

BTW: I agree that device is fs-specific. It's much nicer to see "none"
done away with.

Regards,

Richard
Permanent: [EMAIL PROTECTED]
Current:   [EMAIL PROTECTED]
Re: FS_SINGLE queries
On Sat, 10 Jun 2000, Richard Gooch wrote:
> What I mean by "real" mounts is a table that shows how each FS was
> brought into the namespace (or each namespace, once you implement
> CLONE_NEWNS). So for example:
> #device     filesystem  roots
> /dev/hda1   ext2        /
> /dev/hda2   ext2        /var/spool/mail /gaol/var/spool/mail
> none        proc        /proc /gaol/proc

Bad format. If anything, it should contain mount IDs (if you want to have
union-mount you need those, just to be able to take away components).
The following might go:

1   /                        /                   ext2    /dev/hda1
2   /var/spool/mail          /                   ext2    /dev/hda2
3   /proc                    /                   procfs
14  /gaol/var/spool/mail     /                   ext2    /dev/hda2
15  /gaol/proc               /                   procfs
42  /gaol/lib/libc.2.1.3.so  /lib/libc.2.1.3.so  ext2    /dev/hda1
...

IOW, ID + mountpoint + location of root in its tree + fs type +
fs-specific parameters. That at least allows to reproduce the namespace.
And yes, IMO "device" is fs-specific parameter.
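[Editorial illustration] The five-field layout proposed above (ID, mountpoint, root within its tree, fs type, optional fs-specific parameters) is simple enough to parse with sscanf(3). The sketch below assumes whitespace-separated fields and paths without embedded spaces; the format was only a proposal in this thread, and `struct ns_entry` and `parse_ns_line()` are names invented for this note.

```c
#include <stdio.h>

/* One line of the proposed listing. Buffer sizes are arbitrary. */
struct ns_entry {
    int  id;
    char mountpoint[256];
    char root[256];
    char fstype[32];
    char params[256];   /* empty string when the line has no fifth field */
};

/* Parses "ID mountpoint root fstype [params]".
 * Returns 0 on success, -1 if the four mandatory fields are not present. */
int parse_ns_line(const char *line, struct ns_entry *e)
{
    int n;

    e->params[0] = '\0';   /* stays empty if sscanf() stops after 4 fields */
    n = sscanf(line, "%d %255s %255s %31s %255s",
               &e->id, e->mountpoint, e->root, e->fstype, e->params);
    return n >= 4 ? 0 : -1;
}
```

Feeding it `14 /gaol/var/spool/mail / ext2 /dev/hda2` yields ID 14 with `/dev/hda2` as the parameter, while `3 /proc / procfs` parses with an empty parameter field, matching the point that the device is just one fs-specific parameter among others.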
Re: FS_SINGLE queries
Alexander Viro writes:
> On Sat, 10 Jun 2000, Richard Gooch wrote:
>
> > I see your point. However, that suggests that the naming of
> > /proc/mounts is wrong. Perhaps we should have a /proc/namespace that
> > shows all these VFS bindings, and separately a list of real mounts.
>
> What's "real"? /proc/mounts would better left as it was (funny
> replacement for /etc/mtab) and there should be something along the
> lines of /proc/namespace (hell knows, we might make it compatible
> with /proc/ns from new Plan 9). That something most definitely
> doesn't need to share the format with /proc/mounts...

What I mean by "real" mounts is a table that shows how each FS was
brought into the namespace (or each namespace, once you implement
CLONE_NEWNS). So for example:
#device    filesystem  roots
/dev/hda1  ext2        /
/dev/hda2  ext2        /var/spool/mail /gaol/var/spool/mail
none       proc        /proc /gaol/proc

in /proc/namespace. And I suppose that /proc/namespace would be unique
for each namespace as well.

This way, no distinction is made between the first mount and
subsequent bindings, which is what you'd like, as I gather you'd like
to make all bindings equal.

Aside: I guess the reality is that the first binding (the original
mount -t ext2) is more equal than the subsequent bindings (mount -t
bind). Evidence: the O_CREAT bug I found the other day ;-)

Regards,

Richard
Permanent: [EMAIL PROTECTED]
Current:   [EMAIL PROTECTED]
Re: FS_SINGLE queries
On Sat, 10 Jun 2000, Richard Gooch wrote:

> I see your point. However, that suggests that the naming of
> /proc/mounts is wrong. Perhaps we should have a /proc/namespace that
> shows all these VFS bindings, and separately a list of real mounts.

What's "real"? /proc/mounts would better left as it was (funny
replacement for /etc/mtab) and there should be something along the
lines of /proc/namespace (hell knows, we might make it compatible
with /proc/ns from new Plan 9). That something most definitely
doesn't need to share the format with /proc/mounts...
Re: FS_SINGLE queries
In article [EMAIL PROTECTED] you wrote:
> On Sat, 10 Jun 2000, Richard Gooch wrote:
>
>> Hi, all. I've just been looking at the FS_SINGLE implementation, and
>> have a few comments:
>>
>> - although not documented, you need to do kern_mount() before trying
> Yup.
>>   normal mounts of a FS_SINGLE; perhaps kern_mount()/kern_umount()
>>   should be called automatically in
>>   register_filesystem()/unregister_filesystem()?
>
> I don't think so. They are different operations and I'm not too happy
> about mixing them together. Matter of taste, but...
>
>> - I note that procfs and pipefs call unregister_filesystem() before
>>   calling kern_umount(). This looks counter-intuitive, even if it's
>>   correct (is it?)
>
> It is. Look: first you take it out of reach so that nobody would mount us
> while we are doing kern_umount(), then you kill the tree.

This is more of an argument to combine the two operations, imho.
Is there any point in having a FS_SINGLE filesystem registered,
but not kern_mounted?

Ion

--
It is better to keep your mouth shut and be thought a fool,
than to open it and remove all doubt.
Re: FS_SINGLE queries
Alexander Viro writes:
> On Sat, 10 Jun 2000, Richard Gooch wrote:
>
> > Hi, all. I've just been looking at the FS_SINGLE implementation, and
> > have a few comments:
> >
> > - although not documented, you need to do kern_mount() before trying
> Yup.
> >   normal mounts of a FS_SINGLE; perhaps kern_mount()/kern_umount()
> >   should be called automatically in
> >   register_filesystem()/unregister_filesystem()?
>
> I don't think so. They are different operations and I'm not too happy
> about mixing them together. Matter of taste, but...

Yeah, I know. Having it documented would satisfy me. Getting a kernel
BUG after adding FS_SINGLE was a shock: "what the %@$& ?!?".

> > - I note that procfs and pipefs call unregister_filesystem() before
> >   calling kern_umount(). This looks counter-intuitive, even if it's
> >   correct (is it?)
>
> It is. Look: first you take it out of reach so that nobody would
> mount us while we are doing kern_umount(), then you kill the tree.

I suspected it was about race prevention. Again, if it was documented,
it would be fine. When I first saw the procfs/pipefs code, I was left
wondering if it was safe to unregister before unmounting.

> > - when mounting a FS which is FS_SINGLE, /proc/mounts reports the FS
> >   type rather than "bind", which also seems wrong.
>
> Why? Bind is _not_ a filesystem type. From the kernel point of view
> old and new instances after binding are identical - there is no
> asymmetry.

I see your point. However, that suggests that the naming of
/proc/mounts is wrong. Perhaps we should have a /proc/namespace that
shows all these VFS bindings, and separately a list of real mounts.
I have the feeling we're mixing two different pieces of information
into /proc/mounts at the moment.

Regards,

Richard
Permanent: [EMAIL PROTECTED]
Current:   [EMAIL PROTECTED]
Re: FS_SINGLE queries
On Sat, 10 Jun 2000, Richard Gooch wrote:

> Hi, all. I've just been looking at the FS_SINGLE implementation, and
> have a few comments:
>
> - although not documented, you need to do kern_mount() before trying

Yup.

>   normal mounts of a FS_SINGLE; perhaps kern_mount()/kern_umount()
>   should be called automatically in
>   register_filesystem()/unregister_filesystem()?

I don't think so. They are different operations and I'm not too happy
about mixing them together. Matter of taste, but...

> - I note that procfs and pipefs call unregister_filesystem() before
>   calling kern_umount(). This looks counter-intuitive, even if it's
>   correct (is it?)

It is. Look: first you take it out of reach so that nobody would mount us
while we are doing kern_umount(), then you kill the tree.

> - when mounting a FS which is FS_SINGLE, /proc/mounts reports the FS
>   type rather than "bind", which also seems wrong.

Why? Bind is _not_ a filesystem type. From the kernel point of view
old and new instances after binding are identical - there is no
asymmetry.
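[Editorial sketch, not part of the original message.] The ordering argument
above can be illustrated with a toy userspace model. Every name here
(struct toy_fs, toy_register(), toy_kern_mount(), toy_user_mount() and so
on) is hypothetical and only mimics the roles of register_filesystem(),
kern_mount(), kern_umount() and a user-space mount of a FS_SINGLE
filesystem; it is not the kernel's code and it ignores locking entirely.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-ins; none of these are the kernel's real structures. */
struct toy_mount {
    int users;                   /* how many mounts share this instance */
};

struct toy_fs {
    bool registered;             /* reachable by user mount requests */
    struct toy_mount *kern_mnt;  /* internal mount, NULL until toy_kern_mount() */
    struct toy_mount storage;    /* backing storage for the single instance */
};

void toy_register(struct toy_fs *fs)   { fs->registered = true; }
void toy_unregister(struct toy_fs *fs) { fs->registered = false; }

/* Create the single internal instance that later mounts will share. */
void toy_kern_mount(struct toy_fs *fs)
{
    fs->storage.users = 1;
    fs->kern_mnt = &fs->storage;
}

/* Tear the internal instance down ("kill the tree"). */
void toy_kern_umount(struct toy_fs *fs)
{
    fs->kern_mnt = NULL;
}

/*
 * Model of a user-space mount of a single-instance filesystem.
 * Returns  0 on success,
 *         -1 if the filesystem is not registered (out of reach),
 *         -2 if it is registered but has no internal mount yet, the
 *            situation the thread reports as a kernel BUG.
 */
int toy_user_mount(struct toy_fs *fs)
{
    if (!fs->registered)
        return -1;
    if (fs->kern_mnt == NULL)
        return -2;
    fs->kern_mnt->users++;
    return 0;
}
```

In this model, calling toy_unregister() before toy_kern_umount() means a
mount request arriving during teardown is refused with -1 instead of
finding a registered filesystem whose internal mount has already gone
away, which is the race the reply describes.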
FS_SINGLE queries
Hi, all. I've just been looking at the FS_SINGLE implementation, and
have a few comments:

- although not documented, you need to do kern_mount() before trying
  normal mounts of a FS_SINGLE; perhaps kern_mount()/kern_umount()
  should be called automatically in
  register_filesystem()/unregister_filesystem()?

- I note that procfs and pipefs call unregister_filesystem() before
  calling kern_umount(). This looks counter-intuitive, even if it's
  correct (is it?)

- when mounting a FS which is FS_SINGLE, /proc/mounts reports the FS
  type rather than "bind", which also seems wrong.

Regards,

Richard
Permanent: [EMAIL PROTECTED]
Current:   [EMAIL PROTECTED]