Re: FS_SINGLE queries

2000-06-29 Thread Richard Gooch

Tigran Aivazian writes:
> On Thu, 29 Jun 2000, Richard Gooch wrote:
> > What happens when you try (user-space) mounting a FS_SINGLE filesystem
> > without calling kern_mount() first?
> 
> I get an oops at the line of code I mentioned - it wasn't a theoretical
> oops but a practical one :)
> 
> Basically, I was documenting file_system_type structure and wrote:
> 
> read_super - a pointer to the function that reads the super
>   block during mount operation. This function is required - if it is
>   not provided, mount operation (whether from userspace or inkernel) will
>   always fail except in FS_SINGLE case where it will Oops in
>   get_sb_single() trying to dereference a NULL pointer in
>   fs_type->kern_mnt->mnt_sb with (fs_type->kern_mnt = NULL) if the
>   module did not call kern_mount() in the initialisation routine after
>   filesystem was successfully registered by register_filesystem().
> 
> Now, it would sound much nicer if I could just say instead:
> 
> 
> read_super - a pointer to the function that reads the super
>   block during mount operation. This function is required - if it is
>   not provided, mount operation (whether from userspace or 
>   inkernel) will always fail.
> 
> Don't you agree? So, to test it I wrote a trivial filesystem that
> sets FS_SINGLE and yet provides no read_super in fs_type and
> discovered the oops. Then I added a dummy (always failing)
> read_super() and it oops'd exactly the same - so it doesn't matter
> if read_super is NULL or not for this thing (which reading
> read_super() function suggests anyway).

Hm. I agree that the documentation for read_super() would best not
have to mention FS_SINGLE/kern_mount() issues.

However, these issues should be discussed. I think you just need to
document that at a higher level. That will be "clean".

Regards,

Richard
Permanent: [EMAIL PROTECTED]
Current:   [EMAIL PROTECTED]



Re: FS_SINGLE queries

2000-06-29 Thread Tigran Aivazian

On Thu, 29 Jun 2000, Richard Gooch wrote:
> What happens when you try (user-space) mounting a FS_SINGLE filesystem
> without calling kern_mount() first?

I get an oops at the line of code I mentioned - it wasn't a theoretical
oops but a practical one :)

Basically, I was documenting file_system_type structure and wrote:

read_super - a pointer to the function that reads the super
  block during mount operation. This function is required - if it is
  not provided, mount operation (whether from userspace or inkernel) will
  always fail except in FS_SINGLE case where it will Oops in
  get_sb_single() trying to dereference a NULL pointer in
  fs_type->kern_mnt->mnt_sb with (fs_type->kern_mnt = NULL) if the
  module did not call kern_mount() in the initialisation routine after
  filesystem was successfully registered by register_filesystem().

Now, it would sound much nicer if I could just say instead:


read_super - a pointer to the function that reads the super
  block during mount operation. This function is required - if it is
  not provided, mount operation (whether from userspace or 
  inkernel) will always fail.

Don't you agree? So, to test it I wrote a trivial filesystem that sets
FS_SINGLE and yet provides no read_super in fs_type and discovered the
oops. Then I added a dummy (always failing) read_super() and it oops'd
exactly the same - so it doesn't matter if read_super is NULL or not for
this thing (which reading read_super() function suggests anyway).

Regards,
Tigran





Re: FS_SINGLE queries

2000-06-29 Thread Richard Gooch

Tigran Aivazian writes:
> On Thu, 29 Jun 2000, Richard Gooch wrote:
> > Hm. Digging back into my archives, I see I said I got a kernel BUG. So
> > that means I got a BUG, not an Oops. Perhaps that means that *fs_type
> > hasn't been initialised to 0, or perhaps that fs_type->kern_mnt gets
> > initialised elsewhere even when kern_mount() isn't called (and perhaps
> > kern_mount() just initialises fs_type->kern_mnt->mnt_sb).
> > Speculations only: I haven't RTFS.
> 
> all I am really saying is that this simple filesystem should generate a
> BUG() (pointing to the fact that it should be kern_mount-ed first) and not
> an oops:

I agree one should get a BUG. And that's what I got. So although the
piece of code you looked at suggests you'd get an Oops instead, I
suspect there must be something more subtle happening.

What happens when you try (user-space) mounting a FS_SINGLE filesystem
without calling kern_mount() first?

Regards,

Richard
Permanent: [EMAIL PROTECTED]
Current:   [EMAIL PROTECTED]



Re: FS_SINGLE queries

2000-06-29 Thread Tigran Aivazian

On Thu, 29 Jun 2000, Richard Gooch wrote:

> Tigran Aivazian writes:
> > On Sat, 10 Jun 2000, Alexander Viro wrote:
> > > > - although not documented, you need to do kern_mount() before trying
> > >   Yup.
> > > >   normal mounts of a FS_SINGLE; perhaps kern_mount()/kern_umount()
> > > >   should be called automatically in
> > > >   register_filesystem()/unregister_filesystem()?
> > > 
> > > I don't think so. They are different operations and I'm not too happy
> > > about mixing them together. Matter of taste, but...
> > 
> > In get_sb_single() you wrote:
> > 
> > sb = fs_type->kern_mnt->mnt_sb;
> > if (!sb)
> > > 	BUG();
> > 
> > and it is kern_mount() that initialises type->kern_mnt. So, if one forgot
> > to kern_mount a FS_SINGLE filesystem prior to letting userspace try to
> > mount(2) it, then it is not the BUG() that we hit but an oops of this
> > kind:
> > 
> > Code;  c013c6b1<=
> > >    0:   8b 58 1c  mov    0x1c(%eax),%ebx   <=
> > 
> > (0x1c being offset of mnt_sb in vfsmount)
> > 
> > i.e. maybe we should really have in get_sb_single():
> > 
> > if (!fs_type->kern_mnt || !(sb = fs_type->kern_mnt->mnt_sb))
> > > 	BUG();
> > 
> > I.e. if one forgot to kern_mount then fs_type->kern_mnt will be probably
> > left at NULL so one is more likely to follow a NULL pointer via ->kern_mnt
> > > rather than follow somewhere valid and then find NULL via ->mnt_sb?
> > 
> > Richard, how is it that you actually hit the BUG() above?
> 
> Hm. Digging back into my archives, I see I said I got a kernel BUG. So
> that means I got a BUG, not an Oops. Perhaps that means that *fs_type
> hasn't been initialised to 0, or perhaps that fs_type->kern_mnt gets
> initialised elsewhere even when kern_mount() isn't called (and perhaps
> kern_mount() just initialises fs_type->kern_mnt->mnt_sb).
> Speculations only: I haven't RTFS.


all I am really saying is that this simple filesystem should generate a
BUG() (pointing to the fact that it should be kern_mount-ed first) and not
an oops:

static DECLARE_FSTYPE(single_fs_type, "single", NULL, FS_SINGLE);

/* Deliberately no kern_mount() after registering: this is the failing case. */
static int __init init_single_fs(void)
{
	return register_filesystem(&single_fs_type);
}

static void __exit exit_single_fs(void)
{
	unregister_filesystem(&single_fs_type);
}

Regards,
Tigran




Re: FS_SINGLE queries

2000-06-29 Thread Richard Gooch

Tigran Aivazian writes:
> On Sat, 10 Jun 2000, Alexander Viro wrote:
> > > - although not documented, you need to do kern_mount() before trying
> > Yup.
> > >   normal mounts of a FS_SINGLE; perhaps kern_mount()/kern_umount()
> > >   should be called automatically in
> > >   register_filesystem()/unregister_filesystem()?
> > 
> > I don't think so. They are different operations and I'm not too happy
> > about mixing them together. Matter of taste, but...
> 
> In get_sb_single() you wrote:
> 
> sb = fs_type->kern_mnt->mnt_sb;
> if (!sb)
> 	BUG();
> 
> and it is kern_mount() that initialises type->kern_mnt. So, if one forgot
> to kern_mount a FS_SINGLE filesystem prior to letting userspace try to
> mount(2) it, then it is not the BUG() that we hit but an oops of this
> kind:
> 
> Code;  c013c6b1<=
>    0:   8b 58 1c  mov    0x1c(%eax),%ebx   <=
> 
> (0x1c being offset of mnt_sb in vfsmount)
> 
> i.e. maybe we should really have in get_sb_single():
> 
> if (!fs_type->kern_mnt || !(sb = fs_type->kern_mnt->mnt_sb))
> 	BUG();
> 
> I.e. if one forgot to kern_mount then fs_type->kern_mnt will be probably
> left at NULL so one is more likely to follow a NULL pointer via ->kern_mnt
> rather than follow somewhere valid and then find NULL via ->mnt_sb?
> 
> Richard, how is it that you actually hit the BUG() above?

Hm. Digging back into my archives, I see I said I got a kernel BUG. So
that means I got a BUG, not an Oops. Perhaps that means that *fs_type
hasn't been initialised to 0, or perhaps that fs_type->kern_mnt gets
initialised elsewhere even when kern_mount() isn't called (and perhaps
kern_mount() just initialises fs_type->kern_mnt->mnt_sb).
Speculations only: I haven't RTFS.

Regards,

Richard
Permanent: [EMAIL PROTECTED]
Current:   [EMAIL PROTECTED]



Re: FS_SINGLE queries

2000-06-29 Thread Tigran Aivazian

On Sat, 10 Jun 2000, Alexander Viro wrote:
> > - although not documented, you need to do kern_mount() before trying
>   Yup.
> >   normal mounts of a FS_SINGLE; perhaps kern_mount()/kern_umount()
> >   should be called automatically in
> >   register_filesystem()/unregister_filesystem()?
> 
> I don't think so. They are different operations and I'm not too happy
> about mixing them together. Matter of taste, but...

In get_sb_single() you wrote:

sb = fs_type->kern_mnt->mnt_sb;
if (!sb)
	BUG();

and it is kern_mount() that initialises type->kern_mnt. So, if one forgot
to kern_mount a FS_SINGLE filesystem prior to letting userspace try to
mount(2) it, then it is not the BUG() that we hit but an oops of this
kind:

Code;  c013c6b1<=
   0:   8b 58 1c  mov    0x1c(%eax),%ebx   <=

(0x1c being offset of mnt_sb in vfsmount)

i.e. maybe we should really have in get_sb_single():

if (!fs_type->kern_mnt || !(sb = fs_type->kern_mnt->mnt_sb))
	BUG();

I.e. if one forgot to kern_mount then fs_type->kern_mnt will be probably
left at NULL so one is more likely to follow a NULL pointer via ->kern_mnt
rather than follow somewhere valid and then find NULL via ->mnt_sb?
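
For readers following along, here is a minimal user-space sketch of the two
checks being compared. The structures are cut-down stand-ins, not the kernel
definitions, and the names lookup_sb_unguarded(), lookup_sb_guarded() and the
SB_* result codes are invented for illustration; a real kernel would Oops or
BUG() rather than return a code.

```c
#include <assert.h>
#include <stddef.h>

/* Cut-down stand-ins for the kernel structures (assumption: only the
 * fields discussed in this thread are modelled). */
struct super_block { int dummy; };
struct vfsmount { struct super_block *mnt_sb; };
struct file_system_type { struct vfsmount *kern_mnt; };

/* Hypothetical outcome codes standing in for what the kernel would do. */
enum sb_result { SB_OK, SB_BUG, SB_OOPS };

/* Models the existing check: kern_mnt is followed unconditionally, so a
 * NULL kern_mnt is a NULL dereference (an Oops) before BUG() is reached. */
enum sb_result lookup_sb_unguarded(const struct file_system_type *fs_type)
{
	assert(fs_type != NULL);
	if (!fs_type->kern_mnt)
		return SB_OOPS;	/* the real code would fault at this point */
	if (!fs_type->kern_mnt->mnt_sb)
		return SB_BUG;
	return SB_OK;
}

/* Models the proposed check: both pointers are tested, so a forgotten
 * kern_mount() ends in BUG() rather than an Oops. */
enum sb_result lookup_sb_guarded(const struct file_system_type *fs_type)
{
	assert(fs_type != NULL);
	if (!fs_type->kern_mnt || !fs_type->kern_mnt->mnt_sb)
		return SB_BUG;
	return SB_OK;
}
```

With kern_mnt left at NULL the first function reports the Oops case and the
second reports BUG(); once kern_mnt points at a mount with a superblock, both
report success.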

Richard, how is it that you actually hit the BUG() above?

Regards,
Tigran





Re: FS_SINGLE queries

2000-06-16 Thread Alexander Viro



On Fri, 16 Jun 2000, Richard Gooch wrote:

> - mount procfs on boot :->
> 
> - modify init(8) to not need /dev/tty (which would be a pity, because
>   session management before starting boot scripts is damn handy)
>
> - put all the virtual filesystems under a (known, fixed) kernel
>   namespace and allow a way to access that namespace from user-space
>   (alternative to mounting procfs at boot, so that we don't get
>   bleatings about "policy").
> 
> 

Tsk, tsk... Too obvious one - consider taking lessons from Albert...




Re: FS_SINGLE queries

2000-06-16 Thread Richard Gooch

Alexander Viro writes:
> On Fri, 16 Jun 2000, Richard Gooch wrote:
> 
> > - mount procfs on boot :->
> > 
> > - modify init(8) to not need /dev/tty (which would be a pity, because
> >   session management before starting boot scripts is damn handy)
> >
> > - put all the virtual filesystems under a (known, fixed) kernel
> >   namespace and allow a way to access that namespace from user-space
> >   (alternative to mounting procfs at boot, so that we don't get
> >   bleatings about "policy").
> > 
> > 
> 
> Tsk, tsk... Too obvious one - consider taking lessons from Albert...

Bugger. That was no fun at all.

Regards,

Richard
Permanent: [EMAIL PROTECTED]
Current:   [EMAIL PROTECTED]



Re: FS_SINGLE queries

2000-06-16 Thread Richard Gooch

Alexander Viro writes:
> On Fri, 16 Jun 2000, Richard Gooch wrote:
> > Agreed. /dev/tty always struck me as a bit evil^Wmagic. At the very
> > least, a symlink to /proc/self/tty would make it pretty damn clear
> > even to a novice.
> 
> Unfortunately, unlike /proc/mounts, /dev/tty has to be available
> before mounting procfs. Alas ;-<

Some solutions:

- mount procfs on boot :->

- modify init(8) to not need /dev/tty (which would be a pity, because
  session management before starting boot scripts is damn handy)

- put all the virtual filesystems under a (known, fixed) kernel
  namespace and allow a way to access that namespace from user-space
  (alternative to mounting procfs at boot, so that we don't get
  bleatings about "policy").



Regards,

Richard
Permanent: [EMAIL PROTECTED]
Current:   [EMAIL PROTECTED]



Re: FS_SINGLE queries

2000-06-16 Thread Alexander Viro



On Fri, 16 Jun 2000, Richard Gooch wrote:

> Alexander Viro writes:
> > 
> > 
> > On Fri, 16 Jun 2000, Erez Zadok wrote:
> > 
> > > Hey, we can make it yet another ioctl(2).  Then we can trade a crapload of
> > > syscalls for a crapload of ioctls --- a time-honored Unix tradition... :-)
> > > :-)
> > > 
> > > Seriously, an open/read/.../close would work fine, but on what file?  If
> > > it's something inside /proc, fine, but has the Linux community as a whole
> > > accepted that procfs is a *must* for any working system "or else"?  If the
> > > file to open/read/close won't be in /proc, what type of file it'd be and how
> > > it'd get created?
> > 
> > Depends. If we have per-process namespaces - procfs is the only way
> > to go, simply because there is no such thing as system-wide set of
> > mounts.  However, that procfs will not have to contain anything but
> > per-process data + /proc/self. Another variant is a mechanism a-la
> > /dev/tty, but frankly, I would rather see /dev/tty being a symlink
> > to /proc/self/tty...
> 
> Agreed. /dev/tty always struck me as a bit evil^Wmagic. At the very
> least, a symlink to /proc/self/tty would make it pretty damn clear
> even to a novice.

Unfortunately, unlike /proc/mounts, /dev/tty has to be available before
mounting procfs. Alas ;-<




Re: FS_SINGLE queries

2000-06-16 Thread Richard Gooch

Alexander Viro writes:
> 
> 
> On Fri, 16 Jun 2000, Erez Zadok wrote:
> 
> > Hey, we can make it yet another ioctl(2).  Then we can trade a crapload of
> > syscalls for a crapload of ioctls --- a time-honored Unix tradition... :-)
> > :-)
> > 
> > Seriously, an open/read/.../close would work fine, but on what file?  If
> > it's something inside /proc, fine, but has the Linux community as a whole
> > accepted that procfs is a *must* for any working system "or else"?  If the
> > file to open/read/close won't be in /proc, what type of file it'd be and how
> > it'd get created?
> 
> Depends. If we have per-process namespaces - procfs is the only way
> to go, simply because there is no such thing as system-wide set of
> mounts.  However, that procfs will not have to contain anything but
> per-process data + /proc/self. Another variant is a mechanism a-la
> /dev/tty, but frankly, I would rather see /dev/tty being a symlink
> to /proc/self/tty...

Agreed. /dev/tty always struck me as a bit evil^Wmagic. At the very
least, a symlink to /proc/self/tty would make it pretty damn clear
even to a novice.

Regards,

Richard
Permanent: [EMAIL PROTECTED]
Current:   [EMAIL PROTECTED]



Re: FS_SINGLE queries

2000-06-16 Thread Alexander Viro



On Fri, 16 Jun 2000, Erez Zadok wrote:

> Hey, we can make it yet another ioctl(2).  Then we can trade a crapload of
> syscalls for a crapload of ioctls --- a time-honored Unix tradition... :-)
> :-)
> 
> Seriously, an open/read/.../close would work fine, but on what file?  If
> it's something inside /proc, fine, but has the Linux community as a whole
> accepted that procfs is a *must* for any working system "or else"?  If the
> file to open/read/close won't be in /proc, what type of file it'd be and how
> it'd get created?

Depends. If we have per-process namespaces - procfs is the only way to go,
simply because there is no such thing as system-wide set of mounts.
However, that procfs will not have to contain anything but per-process
data + /proc/self. Another variant is a mechanism a-la /dev/tty, but
frankly, I would rather see /dev/tty being a symlink to /proc/self/tty...




Re: FS_SINGLE queries

2000-06-16 Thread Erez Zadok

In message <[EMAIL PROTECTED]>, Alexander Viro 
writes:
> 
> 
> On Fri, 16 Jun 2000, Erez Zadok wrote:
> 
[...]
> > Anyway, I'd like to see a new syscall that returns a list of mounts and
> 
> Sigh... We already have a crapload of syscalls that should not be there.
> If it can be done by open()/read()/write()/lseek()/close() it should be
> done that way.

Hey, we can make it yet another ioctl(2).  Then we can trade a crapload of
syscalls for a crapload of ioctls --- a time-honored Unix tradition... :-)
:-)

Seriously, an open/read/.../close would work fine, but on what file?  If
it's something inside /proc, fine, but has the Linux community as a whole
accepted that procfs is a *must* for any working system "or else"?  If the
file to open/read/close won't be in /proc, what type of file it'd be and how
it'd get created?

Erez.



Re: FS_SINGLE queries

2000-06-16 Thread Alexander Viro



On Fri, 16 Jun 2000, Erez Zadok wrote:

> > I'm not sure that we need to keep it on procfs - especially with the
> > union-mounts coming into the game.
> 
> Procfs or not, I'm advocating for keeping it in the kernel only, where it
> belongs, and removing the kludgy need (ala Sun and many others) to maintain
> a separate /etc/mtab file.

Oh, definitely.

> Anyway, I'd like to see a new syscall that returns a list of mounts and

Sigh... We already have a crapload of syscalls that should not be there.
If it can be done by open()/read()/write()/lseek()/close() it should be
done that way.

> associated info in linux.  Currently that can be done by reading
> /proc/mounts, but not if procfs isn't available or we're going to take
> /proc/mounts away.

/proc//ns. And we can't remove /proc/mounts (albeit we can take all
crap out of procfs and union-mount it over the thing).




Re: FS_SINGLE queries

2000-06-16 Thread Erez Zadok

In message <[EMAIL PROTECTED]>, [EMAIL PROTECTED] 
writes:
[...]
> so mount could keep a /etc/mtab2 to record this information, but that's
> freaking ugly.  or we could pass a new mount option down into the kernel
> which causes it to display `loop' in that entry, but this seems like a
> waste of a bit.  other alternatives gladly sought.

Not necessarily.  Several OSs use an "ignore" bit as a mount flag telling
programs like df(1) not to stat certain entries by default.  This is often
used for automounted/autofs entries, where normally no reasonable info can
be returned to statvfs(2), plus it's a good idea not to slow df(1) by
stating file systems that may be served by slow user-level file servers (amd
w/o autofs support).  There are cases where such file servers can return
useful info back to statvfs(2) (as amd can).

BTW, the usual reason you don't see such automounted entries is that GNU df
automatically will not list entries with statfs values of 0, but it still
will statfs(2) them which will be slow (and hang if the automounter is
hung).  It's much better if the kernel can record that certain entries were
mounted w/ the "ignore" option, and ensure that df(1) simply doesn't
statfs's 'em at all.

Erez.



Re: FS_SINGLE queries

2000-06-16 Thread willy

On Fri, Jun 16, 2000 at 02:15:32PM +0100, Tigran Aivazian wrote:
> while we are on the subject of obsoleting /etc/mtab in favour of
> /proc/mounts (or enhanced version thereof), please keep in mind the
> classical problem of "mount -o loop" stopping to work if you use
> /proc/mounts. I think Andries Brouwer explained in the past why was that
> the case but I can't remember the reason. The problem was that loopback
> module's refcount will leak (up) when using /proc/mounts unless you
> manually losetup -d appropriate number of times.

oh, the reason is pretty straightforward: mount(8) writes `loop' into the
mount options if /etc/mtab isn't a symlink to /proc/mounts.  if it is,
it can't modify /proc/mounts.  umount(8) will automatically delete the
underlying loop device if the `loop' keyword is there.
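
a rough sketch of the kind of test involved: look for `loop' as a whole
entry in the comma-separated option string. this is not umount(8)'s actual
code, only an illustration in the spirit of hasmntopt(3), and the function
name has_mount_option() is made up.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Returns 1 if the comma-separated option string `opts` contains the
 * option `name` as a whole entry (optionally followed by "=value"),
 * and 0 otherwise. */
int has_mount_option(const char *opts, const char *name)
{
	size_t len;

	assert(opts != NULL && name != NULL);
	len = strlen(name);
	if (len == 0)
		return 0;
	while (*opts) {
		const char *end = strchr(opts, ',');
		size_t n = end ? (size_t)(end - opts) : strlen(opts);

		/* Match "name" exactly or "name=...", never a longer word. */
		if (n >= len && strncmp(opts, name, len) == 0 &&
		    (n == len || opts[len] == '='))
			return 1;
		if (!end)
			break;
		opts = end + 1;
	}
	return 0;
}
```

an options field of "rw,loop" matches, while "rw" alone does not, which
is the situation when the field comes from /proc/mounts.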

i suspect it doesn't autodelete the device based on the device matching
/dev/lo* because the user may have set it up by hand and be very unhappy
about mount trying to be too clever and delete it.

so mount could keep a /etc/mtab2 to record this information, but that's
freaking ugly.  or we could pass a new mount option down into the kernel
which causes it to display `loop' in that entry, but this seems like a
waste of a bit.  other alternatives gladly sought.

something i hope to get time to look at next week is redoing the loop
device to take advantage of the LVM hooks.  it might be significantly
faster than the current system.

-- 
The Sex Pistols were revolutionaries.  The Bay City Rollers weren't.



Re: FS_SINGLE queries

2000-06-16 Thread Tigran Aivazian

On Fri, 16 Jun 2000, Alexander Viro wrote:
> On Fri, 16 Jun 2000, Erez Zadok wrote:
> 
> > On a related note, since we do have /proc/mounts, and assuming that procfs
> > is pretty much necessary nowadays, are we going to get rid of /etc/mtab and
> > completely move all getmntent info into the kernel?  I never liked the fact
> > that people doing mounts (such as automounters) have to ensure that they
> > correctly maintain a separate text file in /etc. 
> 
> I'm not sure that we need to keep it on procfs - especially with the
> union-mounts coming into the game.
> 
...
> > Hmmm, maybe that's a question to the glibc folks.  I guess as long as all
> > the necessary tools and libraries will use /proc/mounts if available, and
> > avoid using /etc/mtab, that'd be ok.
> 
>   How many programs actually need this getmntent(), in the first
> place?
> 

while we are on the subject of obsoleting /etc/mtab in favour of
/proc/mounts (or enhanced version thereof), please keep in mind the
classical problem of "mount -o loop" stopping to work if you use
/proc/mounts. I think Andries Brouwer explained in the past why was that
the case but I can't remember the reason. The problem was that loopback
module's refcount will leak (up) when using /proc/mounts unless you
manually losetup -d appropriate number of times.

Regards,
Tigran






Re: FS_SINGLE queries

2000-06-16 Thread Erez Zadok

In message <[EMAIL PROTECTED]>, Alexander Viro 
writes:
> 
> 
> On Fri, 16 Jun 2000, Erez Zadok wrote:
> 
> > On a related note, since we do have /proc/mounts, and assuming that procfs
> > is pretty much necessary nowadays, are we going to get rid of /etc/mtab and
> > completely move all getmntent info into the kernel?  I never liked the fact
> > that people doing mounts (such as automounters) have to ensure that they
> > correctly maintain a separate text file in /etc. 
> 
> I'm not sure that we need to keep it on procfs - especially with the
> union-mounts coming into the game.

Procfs or not, I'm advocating for keeping it in the kernel only, where it
belongs, and removing the kludgy need (ala Sun and many others) to maintain
a separate /etc/mtab file.

> > Hmmm, maybe that's a question to the glibc folks.  I guess as long as all
> > the necessary tools and libraries will use /proc/mounts if available, and
> > avoid using /etc/mtab, that'd be ok.
> 
>   How many programs actually need this getmntent(), in the first
> place?

Programs like df(1) need to read mtab.  Automounters (such as amd, which I
maintain) and /bin/mount need to write it.  The problem with a separate mtab
file is that there's no way to guarantee that the file in /etc is in sync w/
the actual mounts in the kernel.  There are many reasons why you can get an
mtab file that's out of sync w/ the actual in-kernel mounts.  AIX, Ultrix,
and BSD44 did the right thing by moving this mtab list into the kernel, and
rewriting "[gs]etmntent" (they also renamed them) so they query the kernel
via a syscall.  Solaris 8 moved that way too, but kept backwards
compatibility using their special mntfs.

Anyway, I'd like to see a new syscall that returns a list of mounts and
associated info in linux.  Currently that can be done by reading
/proc/mounts, but not if procfs isn't available or we're going to take
/proc/mounts away.  It would make programs like df more reliable, and
programs like /bin/mount won't have to rewrite the mtab file each time a
mount(2) is made.  And it'll make amd work a little faster (I already
auto-detect in-kernel vs. in-/etc mount tables and handle that in amd).
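
As a rough sketch of what a consumer such as df has to do today when it reads
/proc/mounts: each line carries the same whitespace-separated fields as
/etc/mtab (device, mount point, type, options, then two numeric fields), so a
small parser is enough. The struct and function names below are invented for
illustration, and octal escapes such as \040 in paths are not decoded.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* One parsed line of a mtab-style mount table (hypothetical layout). */
struct mount_entry {
	char device[256];
	char mountpoint[256];
	char fstype[64];
	char options[256];
};

/* Parses the first four fields of one mtab-style line.
 * Returns 1 on success and 0 if the line has fewer than four fields. */
int parse_mounts_line(const char *line, struct mount_entry *out)
{
	assert(line != NULL && out != NULL);
	return sscanf(line, "%255s %255s %63s %255s",
		      out->device, out->mountpoint,
		      out->fstype, out->options) == 4;
}

/* Returns non-zero if the entry's filesystem type equals `fstype`. */
int mount_entry_is_type(const struct mount_entry *e, const char *fstype)
{
	assert(e != NULL && fstype != NULL);
	return strcmp(e->fstype, fstype) == 0;
}
```

A caller would feed it each line obtained with fgets() from the table it has
open; getmntent(3) does the equivalent job in glibc.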

Anyway it's not a big thing or something that we need to do right now.

Erez.



Re: FS_SINGLE queries

2000-06-16 Thread Alexander Viro



On Fri, 16 Jun 2000, Erez Zadok wrote:

> On a related note, since we do have /proc/mounts, and assuming that procfs
> is pretty much necessary nowadays, are we going to get rid of /etc/mtab and
> completely move all getmntent info into the kernel?  I never liked the fact
> that people doing mounts (such as automounters) have to ensure that they
> correctly maintain a separate text file in /etc. 

I'm not sure that we need to keep it on procfs - especially with the
union-mounts coming into the game.

> If we want to go crazy, we can implement mntfs ala Solaris 8, which moved
> the mnt info into the kernel, but allowed for "editing" /etc/mnttab which is
> now a special f/s mounted on top of a single file.

I'd rather not do it. I know that I'm biased, but Sun seems to be
seriously bound on turning their system into a tasteless pile of kludges.
I wouldn't trust SunSoft to design a loo cover - it would weigh a ton,
come with HTML manual, require a complex system of levers to raise and
have a nasty habit of coming down _hard_ in the most inconvenient moments.
Sorry - still bitter over their switch from SunOS 4 to Slowlartus...

> Hmmm, maybe that's a question to the glibc folks.  I guess as long as all
> the necessary tools and libraries will use /proc/mounts if available, and
> avoid using /etc/mtab, that'd be ok.

How many programs actually need this getmntent(), in the first
place?




Re: FS_SINGLE queries

2000-06-16 Thread Erez Zadok

In message <[EMAIL PROTECTED]>, Alexander 
Viro writes:
> 
> 
> On Sat, 10 Jun 2000, Richard Gooch wrote:
> 
> > I see your point. However, that suggests that the naming of
> > /proc/mounts is wrong. Perhaps we should have a /proc/namespace that
> > shows all these VFS bindings, and separately a list of real mounts.
> 
> What's "real"? /proc/mounts would better left as it was (funny replacement
> for /etc/mtab) and there should be something along the lines of
> /proc/namespace (hell knows, we might make it compatible with /proc/ns
> from new Plan 9). That something most definitely doesn't need to share the
> format with /proc/mounts...

On a related note, since we do have /proc/mounts, and assuming that procfs
is pretty much necessary nowadays, are we going to get rid of /etc/mtab and
completely move all getmntent info into the kernel?  I never liked the fact
that people doing mounts (such as automounters) have to ensure that they
correctly maintain a separate text file in /etc. 

If we want to go crazy, we can implement mntfs ala Solaris 8, which moved
the mnt info into the kernel, but allowed for "editing" /etc/mnttab which is
now a special f/s mounted on top of a single file.

Hmmm, maybe that's a question to the glibc folks.  I guess as long as all
the necessary tools and libraries will use /proc/mounts if available, and
avoid using /etc/mtab, that'd be ok.

Erez.



Re: FS_SINGLE queries

2000-06-10 Thread Richard Gooch

Alexander Viro writes:
> On Sat, 10 Jun 2000, Richard Gooch wrote:
> > Will it really make much difference? What would be harder to do
> > without mount IDs? And how much harder?
> 
> Beware of functions with many arguments... Besides, what about "kill
> the component of union-mount on /barf NFS-mounted from
> venus:/foo/bar"?  What exactly are you going to pass here? Such
> stuff is better left to userland.

Let's see. Pass the same stuff you see in /proc/namespace? Instead of
cut-and-paste of the mount ID, cut-and-paste the other entries on that
line. You're making the decision based on what's in /proc/namespace
anyway. Why add another level of indirection?

> > > And then... consider the situation when root logs in and decides to
> > > mess with luser's namespace.
> > 
> > What about it?
> 
> bastard@venus% su -
> Password:
> root@venus% w luser
> 
> luser pts/0   ...
> root@venus% ps t pts/0
> 
> 
> 728 pts/0 ...
> root@venus% cat /proc/728/ns
> 
> 
> 123749  /home/luser/foo /   nfs 
> root@venus% umount -I 123749
> root@venus% logout
> bastard@venus% mail luser
> Subject: you've been told to umount ~/foo
> 
> ^D
> 
> > Avoiding numbers is a good thing. They have no intrinsic meaning.
> 
> Tell that to guys who invented file descriptors. IMO that works
> quite fine - I'll prefer to do close(17) rather than  incantations of horror needed on OS/360> type that stuff late on Saturday>

But FDs are different. You reference the file by name, and the system
returns an opaque handle. But the system isn't returning a mount ID as
the result of some operation: you had to scan a file for it. So really
your handle to the mount is something else, the mount ID is merely
another level of indirection.

Regards,

Richard
Permanent: [EMAIL PROTECTED]
Current:   [EMAIL PROTECTED]



Re: FS_SINGLE queries

2000-06-10 Thread Alexander Viro



On Sat, 10 Jun 2000, Richard Gooch wrote:

> Will it really make much difference? What would be harder to do
> without mount IDs? And how much harder?

Beware of functions with many arguments... Besides, what about "kill
the component of union-mount on /barf NFS-mounted from venus:/foo/bar"?
What exactly are you going to pass here? Such stuff is better left to
userland.
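
To make the union-mount point concrete, here is a small user-space sketch:
once two filesystems are stacked on one mountpoint, the path no longer names
a single component, while an ID does. The table layout and both function
names are invented for illustration and say nothing about how the kernel
would store this.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* One line of a hypothetical namespace table. */
struct ns_entry {
	int id;
	const char *mountpoint;
	const char *fstype;
	const char *source;
};

/* Counts the entries stacked on `mountpoint`. A result above 1 means the
 * path alone cannot pick out one component. */
size_t count_at_mountpoint(const struct ns_entry *tab, size_t n,
			   const char *mountpoint)
{
	size_t i, hits = 0;

	assert(tab != NULL && mountpoint != NULL);
	for (i = 0; i < n; i++)
		if (strcmp(tab[i].mountpoint, mountpoint) == 0)
			hits++;
	return hits;
}

/* Removes the entry with the given ID, closing the gap in the array.
 * Returns the new entry count (unchanged if the ID is not present). */
size_t remove_by_id(struct ns_entry *tab, size_t n, int id)
{
	size_t i;

	assert(tab != NULL);
	for (i = 0; i < n; i++) {
		if (tab[i].id == id) {
			memmove(&tab[i], &tab[i + 1],
				(n - i - 1) * sizeof(tab[0]));
			return n - 1;
		}
	}
	return n;
}
```

With an ext2 mount and an NFS mount from venus:/foo/bar both on /barf,
count_at_mountpoint() reports 2 for /barf, and remove_by_id() with the NFS
entry's ID takes away just that component.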

> > And then... consider the situation when root logs in and decides to
> > mess with luser's namespace.
> 
> What about it?

bastard@venus% su -
Password:
root@venus% w luser

luser   pts/0   ...
root@venus% ps t pts/0


728 pts/0 ...
root@venus% cat /proc/728/ns


123749  /home/luser/foo /   nfs 
root@venus% umount -I 123749
root@venus% logout
bastard@venus% mail luser
Subject: you've been told to umount ~/foo

^D

> Avoiding numbers is a good thing. They have no intrinsic meaning.

Tell that to guys who invented file descriptors. IMO that works quite
fine - I'll prefer to do close(17) rather than 




Re: FS_SINGLE queries

2000-06-10 Thread Richard Gooch

Alexander Viro writes:
> 
> 
> On Sat, 10 Jun 2000, Richard Gooch wrote:
> 
> > Yeah, sure. I did say "for example". Your format looks fine. One
> > question: is the mount ID really needed? Can't you distinguish based
> > on what FS you're mounting (and mountpoint root)?
> 
> First of all, interface is simpler that way.

Will it really make much difference? What would be harder to do
without mount IDs? And how much harder?

> And then... consider the situation when root logs in and decides to
> mess with luser's namespace.

What about it?

Avoiding numbers is a good thing. They have no intrinsic meaning.

Regards,

Richard
Permanent: [EMAIL PROTECTED]
Current:   [EMAIL PROTECTED]



Re: FS_SINGLE queries

2000-06-10 Thread Alexander Viro



On Sat, 10 Jun 2000, Richard Gooch wrote:

> Yeah, sure. I did say "for example". Your format looks fine. One
> question: is the mount ID really needed? Can't you distinguish based
> on what FS you're mounting (and mountpoint root)?

First of all, interface is simpler that way. And then... consider the
situation when root logs in and decides to mess with luser's namespace.




Re: FS_SINGLE queries

2000-06-10 Thread Richard Gooch

Alexander Viro writes:
> 
> 
> On Sat, 10 Jun 2000, Richard Gooch wrote:
> 
> > What I mean by "real" mounts is a table that shows how each FS was
> > brought into the namespace (or each namespace, once you implement
> > CLONE_NEWNS). So for example:
> > #device     filesystem  roots
> > /dev/hda1   ext2        /
> > /dev/hda2   ext2        /var/spool/mail /gaol/var/spool/mail
> > none        proc        /proc /gaol/proc
> 
> Bad format. If anything, it should contain mount IDs (if you want to have
> union-mount you need those, just to be able to take away components).
> The following might go:
> 
> 1   /                       /                   ext2    /dev/hda1
> 2   /var/spool/mail         /                   ext2    /dev/hda2
> 3   /proc                   /                   procfs
> 14  /gaol/var/spool/mail    /                   ext2    /dev/hda2
> 15  /gaol/proc              /                   procfs
> 42  /gaol/lib/libc.2.1.3.so /lib/libc.2.1.3.so  ext2    /dev/hda1
> ...
> 
> IOW, ID + mountpoint + location of root in its tree + fs type +
> fs-specific parameters. That at least allows to reproduce the
> namespace. And yes, IMO "device" is fs-specific parameter.

Yeah, sure. I did say "for example". Your format looks fine. One
question: is the mount ID really needed? Can't you distinguish based
on what FS you're mounting (and mountpoint root)?

BTW: I agree that device is fs-specific. It's much nicer to see "none"
done away with.

Regards,

Richard
Permanent: [EMAIL PROTECTED]
Current:   [EMAIL PROTECTED]



Re: FS_SINGLE queries

2000-06-10 Thread Alexander Viro



On Sat, 10 Jun 2000, Richard Gooch wrote:

> What I mean by "real" mounts is a table that shows how each FS was
> brought into the namespace (or each namespace, once you implement
> CLONE_NEWNS). So for example:
> #device     filesystem  roots
> /dev/hda1   ext2        /
> /dev/hda2   ext2        /var/spool/mail /gaol/var/spool/mail
> none        proc        /proc /gaol/proc

Bad format. If anything, it should contain mount IDs (if you want to have
union-mount you need those, just to be able to take away components).
The following might go:

1   /                        /                   ext2    /dev/hda1
2   /var/spool/mail          /                   ext2    /dev/hda2
3   /proc                    /                   procfs
14  /gaol/var/spool/mail     /                   ext2    /dev/hda2
15  /gaol/proc               /                   procfs
42  /gaol/lib/libc.2.1.3.so  /lib/libc.2.1.3.so  ext2    /dev/hda1
...

IOW, ID + mountpoint + location of root in its tree + fs type + fs-specific
parameters. That at least allows to reproduce the namespace. And yes, IMO
"device" is fs-specific parameter.
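
As a rough illustration of the record layout described above (not code
from this thread), the following sketch parses such lines into ID,
mountpoint, root, fs type and fs-specific parameters. The names
MountRecord and parse_record are hypothetical, and the sketch assumes
whitespace-separated fields with no spaces inside paths. Indexing by ID
shows how two mounts of the same filesystem (2 and 14) stay
distinguishable.

```python
from typing import List, NamedTuple


class MountRecord(NamedTuple):
    """One line of the proposed listing (field names are illustrative)."""
    mount_id: int      # mount ID, e.g. 14
    mountpoint: str    # where the tree is attached in the namespace
    root: str          # location of the root within its own tree
    fs_type: str       # e.g. "ext2" or "procfs"
    params: List[str]  # fs-specific parameters; the device ends up here


def parse_record(line: str) -> MountRecord:
    """Split one whitespace-separated line into a MountRecord."""
    fields = line.split()
    if len(fields) < 4:
        raise ValueError(f"expected at least 4 fields: {line!r}")
    return MountRecord(int(fields[0]), fields[1], fields[2],
                       fields[3], fields[4:])


# The example listing from the message above.
SAMPLE = """\
1   /                        /                   ext2    /dev/hda1
2   /var/spool/mail          /                   ext2    /dev/hda2
3   /proc                    /                   procfs
14  /gaol/var/spool/mail     /                   ext2    /dev/hda2
15  /gaol/proc               /                   procfs
42  /gaol/lib/libc.2.1.3.so  /lib/libc.2.1.3.so  ext2    /dev/hda1
"""

records = [parse_record(line) for line in SAMPLE.splitlines() if line.strip()]

# Index by mount ID so a single component can be picked out unambiguously.
by_id = {rec.mount_id: rec for rec in records}

for rec in records:
    print(rec)
```

Whether a real /proc/namespace would be whitespace-separated is an
assumption here; the thread does not settle the exact syntax.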




Re: FS_SINGLE queries

2000-06-10 Thread Richard Gooch

Alexander Viro writes:
> 
> 
> On Sat, 10 Jun 2000, Richard Gooch wrote:
> 
> > I see your point. However, that suggests that the naming of
> > /proc/mounts is wrong. Perhaps we should have a /proc/namespace that
> > shows all these VFS bindings, and separately a list of real mounts.
> 
> > What's "real"? /proc/mounts would be better left as it was (funny
> replacement for /etc/mtab) and there should be something along the
> lines of /proc/namespace (hell knows, we might make it compatible
> with /proc/ns from new Plan 9). That something most definitely
> doesn't need to share the format with /proc/mounts...

What I mean by "real" mounts is a table that shows how each FS was
brought into the namespace (or each namespace, once you implement
CLONE_NEWNS). So for example:
#device     filesystem  roots
/dev/hda1   ext2        /
/dev/hda2   ext2        /var/spool/mail /gaol/var/spool/mail
none        proc        /proc /gaol/proc

in /proc/namespace. And I suppose that /proc/namespace would be unique
for each namespace as well. This way, no distinction is made between
the first mount and subsequent bindings, which is what you'd like, as
I gather you'd like to make all bindings equal.
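
To make the table above concrete, here is a minimal sketch of how such a
listing could be derived from a flat list of bindings by grouping the
mountpoints of each filesystem. The function names and the input tuples
are hypothetical; nothing here is taken from kernel code.

```python
from typing import Dict, List, Tuple

# (device, filesystem, mountpoint) for every binding, in mount order.
# The values mirror the example table; the tuple layout is an assumption.
BINDINGS: List[Tuple[str, str, str]] = [
    ("/dev/hda1", "ext2", "/"),
    ("/dev/hda2", "ext2", "/var/spool/mail"),
    ("none", "proc", "/proc"),
    ("/dev/hda2", "ext2", "/gaol/var/spool/mail"),
    ("none", "proc", "/gaol/proc"),
]


def group_roots(bindings: List[Tuple[str, str, str]]
                ) -> Dict[Tuple[str, str], List[str]]:
    """Collect all mountpoints of each (device, filesystem) pair."""
    table: Dict[Tuple[str, str], List[str]] = {}
    for device, fstype, mountpoint in bindings:
        table.setdefault((device, fstype), []).append(mountpoint)
    return table


def format_table(table: Dict[Tuple[str, str], List[str]]) -> str:
    """Render one line per filesystem, listing every root it is bound at."""
    lines = ["#device     filesystem  roots"]
    for (device, fstype), roots in table.items():
        lines.append(f"{device:<12}{fstype:<12}{' '.join(roots)}")
    return "\n".join(lines)


print(format_table(group_roots(BINDINGS)))
```

Note that the first mount and later bindings are treated identically:
each is simply another entry in the roots column.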

Aside: I guess the reality is that the first binding (the original
mount -t ext2) is more equal than the subsequent bindings (mount -t
bind). Evidence: the O_CREAT bug I found the other day ;-)

Regards,

Richard
Permanent: [EMAIL PROTECTED]
Current:   [EMAIL PROTECTED]



Re: FS_SINGLE queries

2000-06-10 Thread Alexander Viro



On Sat, 10 Jun 2000, Richard Gooch wrote:

> I see your point. However, that suggests that the naming of
> /proc/mounts is wrong. Perhaps we should have a /proc/namespace that
> shows all these VFS bindings, and separately a list of real mounts.

What's "real"? /proc/mounts would be better left as it was (funny replacement
for /etc/mtab) and there should be something along the lines of
/proc/namespace (hell knows, we might make it compatible with /proc/ns
from new Plan 9). That something most definitely doesn't need to share the
format with /proc/mounts...




Re: FS_SINGLE queries

2000-06-10 Thread Ion Badulescu

In article [EMAIL PROTECTED]> you wrote:

> On Sat, 10 Jun 2000, Richard Gooch wrote:
> 
>>   Hi, all. I've just been looking at the FS_SINGLE implementation, and
>> have a few comments:
>> 
>> - although not documented, you need to do kern_mount() before trying
>   Yup.
>>   normal mounts of a FS_SINGLE; perhaps kern_mount()/kern_umount()
>>   should be called automatically in
>>   register_filesystem()/unregister_filesystem()?
> 
> I don't think so. They are different operations and I'm not too happy
> about mixing them together. Matter of taste, but...
> 
>> - I note that procfs and pipefs call unregister_filesystem() before
>>   calling kern_umount(). This looks counter-intuitive, even if it's
>>   correct (is it?)
> 
> It is. Look: first you take it out of reach so that nobody would mount us
> while we are doing kern_umount(), then you kill the tree.

This is more of an argument to combine the two operations, imho. Is there
any point in having a FS_SINGLE filesystem registered, but not kern_mounted?


Ion

-- 
  It is better to keep your mouth shut and be thought a fool,
than to open it and remove all doubt.



Re: FS_SINGLE queries

2000-06-10 Thread Richard Gooch

Alexander Viro writes:
> 
> 
> On Sat, 10 Jun 2000, Richard Gooch wrote:
> 
> >   Hi, all. I've just been looking at the FS_SINGLE implementation, and
> > have a few comments:
> > 
> > - although not documented, you need to do kern_mount() before trying
>   Yup.
> >   normal mounts of a FS_SINGLE; perhaps kern_mount()/kern_umount()
> >   should be called automatically in
> >   register_filesystem()/unregister_filesystem()?
> 
> I don't think so. They are different operations and I'm not too happy
> about mixing them together. Matter of taste, but...

Yeah, I know. Having it documented would satisfy me. Getting a kernel
BUG after adding FS_SINGLE was a shock: "what the %@$& ?!?".

> > - I note that procfs and pipefs call unregister_filesystem() before
> >   calling kern_umount(). This looks counter-intuitive, even if it's
> >   correct (is it?)
> 
> It is. Look: first you take it out of reach so that nobody would
> mount us while we are doing kern_umount(), then you kill the tree.

I suspected it was about race prevention. Again, if it was documented,
it would be fine. When I first saw the procfs/pipefs code, I was left
wondering if it was safe to unregister before unmounting.

> > - when mounting a FS which is FS_SINGLE, /proc/mounts reports the FS
> >   type rather than "bind", which also seems wrong.
> 
> Why? Bind is _not_ a filesystem type. From the kernel point of view
> old and new instances after binding are identical - there is no
> asymmetry.

I see your point. However, that suggests that the naming of
/proc/mounts is wrong. Perhaps we should have a /proc/namespace that
shows all these VFS bindings, and separately a list of real mounts.

I have the feeling we're mixing two different pieces of information
into /proc/mounts at the moment.

Regards,

Richard
Permanent: [EMAIL PROTECTED]
Current:   [EMAIL PROTECTED]



Re: FS_SINGLE queries

2000-06-10 Thread Alexander Viro



On Sat, 10 Jun 2000, Richard Gooch wrote:

>   Hi, all. I've just been looking at the FS_SINGLE implementation, and
> have a few comments:
> 
> - although not documented, you need to do kern_mount() before trying
Yup.
>   normal mounts of a FS_SINGLE; perhaps kern_mount()/kern_umount()
>   should be called automatically in
>   register_filesystem()/unregister_filesystem()?

I don't think so. They are different operations and I'm not too happy
about mixing them together. Matter of taste, but...

> - I note that procfs and pipefs call unregister_filesystem() before
>   calling kern_umount(). This looks counter-intuitive, even if it's
>   correct (is it?)

It is. Look: first you take it out of reach so that nobody would mount us
while we are doing kern_umount(), then you kill the tree.
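
The ordering argument can be sketched with a toy user-space model. This
is not kernel code: the Python functions below only borrow the kernel
names, and "OOPS" stands in for the NULL kern_mnt dereference reported
in this thread. ENODEV is what mount(2) returns for an unknown
filesystem type.

```python
class FsType:
    """Toy stand-in for a FS_SINGLE file_system_type."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.kern_mnt = None  # stays None until kern_mount() has run


_registry = {}  # filesystem types visible to user-space mount requests


def register_filesystem(fs: FsType) -> None:
    _registry[fs.name] = fs


def unregister_filesystem(fs: FsType) -> None:
    _registry.pop(fs.name, None)


def kern_mount(fs: FsType) -> None:
    fs.kern_mnt = object()  # the single in-kernel mount now exists


def kern_umount(fs: FsType) -> None:
    fs.kern_mnt = None  # kill the tree


def user_mount(name: str) -> str:
    """Model of a user-space mount request for a FS_SINGLE type."""
    fs = _registry.get(name)
    if fs is None:
        return "ENODEV"  # type is out of reach: clean failure
    if fs.kern_mnt is None:
        return "OOPS"    # registered but never kern_mounted
    return "OK"


def exit_fs(fs: FsType) -> None:
    """Teardown in the order procfs and pipefs use."""
    unregister_filesystem(fs)  # first take it out of reach
    kern_umount(fs)            # then kill the tree


fs = FsType("examplefs")
register_filesystem(fs)
print(user_mount("examplefs"))  # registered, kern_mount() not yet called
kern_mount(fs)
print(user_mount("examplefs"))
exit_fs(fs)
print(user_mount("examplefs"))
```

In this model, swapping the two calls in exit_fs() would leave a window
in which user_mount() finds the type registered with no kern_mnt, which
is the case the unregister-first ordering avoids.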

> - when mounting a FS which is FS_SINGLE, /proc/mounts reports the FS
>   type rather than "bind", which also seems wrong.

Why? Bind is _not_ a filesystem type. From the kernel point of view old
and new instances after binding are identical - there is no asymmetry.




FS_SINGLE queries

2000-06-10 Thread Richard Gooch

  Hi, all. I've just been looking at the FS_SINGLE implementation, and
have a few comments:

- although not documented, you need to do kern_mount() before trying
  normal mounts of a FS_SINGLE; perhaps kern_mount()/kern_umount()
  should be called automatically in
  register_filesystem()/unregister_filesystem()?

- I note that procfs and pipefs call unregister_filesystem() before
  calling kern_umount(). This looks counter-intuitive, even if it's
  correct (is it?)

- when mounting a FS which is FS_SINGLE, /proc/mounts reports the FS
  type rather than "bind", which also seems wrong.

Regards,

Richard
Permanent: [EMAIL PROTECTED]
Current:   [EMAIL PROTECTED]