Re: For review: rewritten pivot_root(2) manual page

2019-10-10 Thread Michael Kerrisk (man-pages)
Hello Eric,

I think I just understood something. See below.

On 10/9/19 11:01 PM, Michael Kerrisk (man-pages) wrote:
> Hello Eric,
> 
> On 10/9/19 6:00 PM, Eric W. Biederman wrote:
>> "Michael Kerrisk (man-pages)"  writes:
>>
>>> Hello Eric,
>>>
>>> Thank you. I was hoping you might jump in on this thread.
>>>
>>> Please see below.
>>>
>>> On 10/9/19 10:46 AM, Eric W. Biederman wrote:
>>>> "Michael Kerrisk (man-pages)"  writes:
>>>>
>>>>> Hello Philipp,
>>>>>
>>>>> My apologies that it has taken a while to reply. (I had been hoping
>>>>> and waiting that a few more people might weigh in on this thread.)
>>>>>
>>>>> On 9/23/19 3:42 PM, Philipp Wendler wrote:
>>>>>> Hello Michael,
>>>>>>
>>>>>> On 23.09.19 at 14:04, Michael Kerrisk (man-pages) wrote:
>>>>>>
>>>>>>> I'm considering to rewrite these pieces to exactly
>>>>>>> describe what the system call does (which I already
>>>>>>> do in the third paragraph) and remove the "may or may not"
>>>>>>> pieces in the second paragraph. I'd welcome comments
>>>>>>> on making that change.
>>>
>>> What did you think about my proposal above? To put it in context,
>>> this was my initial comment in the mail:
>>>
>>> [[
>>> One area of the page that I'm still not really happy with
>>> is the "vague" wording in the second paragraph and the note
>>> in the third paragraph about the system call possibly
>>> changing. These pieces survive (in somewhat modified form)
>>> from the original page, which was written before the
>>> system call was released, and it seems there was some
>>> question about whether the system call might still change
>>> its behavior with respect to the root directory and current
>>> working directory of other processes. However, after 19
>>> years, nothing has changed, and surely it will not in the
>>> future, since that would constitute an ABI breakage.
>>> I'm considering to rewrite these pieces to exactly
>>> describe what the system call does (which I already
>>> do in the third paragraph) and remove the "may or may not"
>>> pieces in the second paragraph. I'd welcome comments
>>> on making that change.
>>> ]]
>>>
>>> And the second and third paragraphs of the manual page currently
>>> read:
>>>
>>> [[
>>>pivot_root()  may  or may not change the current root and the cur‐
>>>rent working directory of any processes or threads  that  use  the
>>>old  root  directory  and which are in the same mount namespace as
>>>the caller of pivot_root().  The  caller  of  pivot_root()  should
>>>ensure  that  processes  with root or current working directory at
>>>the old root operate correctly in either case.   An  easy  way  to
>>>ensure  this is to change their root and current working directory
>>>to  new_root  before  invoking  pivot_root().   Note   also   that
>>>pivot_root()  may  or may not affect the calling process's current
>>>working directory.  It is therefore recommended to call chdir("/")
>>>immediately after pivot_root().
>>>
>>>The  paragraph  above  is  intentionally vague because at the time
>>>when pivot_root() was first implemented, it  was  unclear  whether
>>>its  effect  on other processes' root and current working directories
>>>(and the caller's current working  directory)  might  change  in
>>>the  future.   However, the behavior has remained consistent since
>>>this system call was first implemented: pivot_root()  changes  the
>>>root  directory  and the current working directory of each process
>>>or thread in the same mount namespace to new_root if they point to
>>>the  old  root  directory.   (See also NOTES.)  On the other hand,
>>>pivot_root() does not change the caller's current  working  direc‐
>>>tory  (unless it is on the old root directory), and thus it should
>>>be followed by a chdir("/") call.
>>> ]]
>>
>> Apologies. I saw that concern, but I didn't realize it was a question.
>>
>> I think it is very reasonable to remove the warning that the behavior
>> might change.  pivot_root(8) is in common use, and using it requires
>> the semantic of changing processes other than the current process,
>> which means any attempt to noticeably change the behavior of
>> pivot_root(2) will break userspace.
> 
> Thanks for the confirmation that this change would be okay.
> I will make this change soon, unless I hear a counterargument.
> 
>> Now the documented semantics in behavior above are not quite what
>> pivot_root(2) does.  It walks all processes on the system and if the
>> working directory or the root directory refer to the root mount that is
>> being replaced, then pivot_root(2) will update them.
>>
>> In practice the above is limited to a mount namespace.  But something as
>> simple as "cd /proc//root" can allow a process to have a
>> working directory in a different mount namespace.
> 
> So, I'm not quite clear. Do you mean that something in the existing
> ma

Re: For review: rewritten pivot_root(2) manual page

2019-10-09 Thread Michael Kerrisk (man-pages)
Hello Eric,

On 10/9/19 6:00 PM, Eric W. Biederman wrote:
> "Michael Kerrisk (man-pages)"  writes:
> 
>> Hello Eric,
>>
>> Thank you. I was hoping you might jump in on this thread.
>>
>> Please see below.
>>
>> On 10/9/19 10:46 AM, Eric W. Biederman wrote:
>>> "Michael Kerrisk (man-pages)"  writes:
>>>
>>>> Hello Philipp,
>>>>
>>>> My apologies that it has taken a while to reply. (I had been hoping
>>>> and waiting that a few more people might weigh in on this thread.)
>>>>
>>>> On 9/23/19 3:42 PM, Philipp Wendler wrote:
>>>>> Hello Michael,
>>>>>
>>>>> On 23.09.19 at 14:04, Michael Kerrisk (man-pages) wrote:
>>>>>
>>>>>> I'm considering to rewrite these pieces to exactly
>>>>>> describe what the system call does (which I already
>>>>>> do in the third paragraph) and remove the "may or may not"
>>>>>> pieces in the second paragraph. I'd welcome comments
>>>>>> on making that change.
>>
>> What did you think about my proposal above? To put it in context,
>> this was my initial comment in the mail:
>>
>> [[
>> One area of the page that I'm still not really happy with
>> is the "vague" wording in the second paragraph and the note
>> in the third paragraph about the system call possibly
>> changing. These pieces survive (in somewhat modified form)
>> from the original page, which was written before the
>> system call was released, and it seems there was some
>> question about whether the system call might still change
>> its behavior with respect to the root directory and current
>> working directory of other processes. However, after 19
>> years, nothing has changed, and surely it will not in the
>> future, since that would constitute an ABI breakage.
>> I'm considering to rewrite these pieces to exactly
>> describe what the system call does (which I already
>> do in the third paragraph) and remove the "may or may not"
>> pieces in the second paragraph. I'd welcome comments
>> on making that change.
>> ]]
>>
>> And the second and third paragraphs of the manual page currently
>> read:
>>
>> [[
>>pivot_root()  may  or may not change the current root and the cur‐
>>rent working directory of any processes or threads  that  use  the
>>old  root  directory  and which are in the same mount namespace as
>>the caller of pivot_root().  The  caller  of  pivot_root()  should
>>ensure  that  processes  with root or current working directory at
>>the old root operate correctly in either case.   An  easy  way  to
>>ensure  this is to change their root and current working directory
>>to  new_root  before  invoking  pivot_root().   Note   also   that
>>pivot_root()  may  or may not affect the calling process's current
>>working directory.  It is therefore recommended to call chdir("/")
>>immediately after pivot_root().
>>
>>The  paragraph  above  is  intentionally vague because at the time
>>when pivot_root() was first implemented, it  was  unclear  whether
>>its  effect  on other processes' root and current working directories
>>(and the caller's current working  directory)  might  change  in
>>the  future.   However, the behavior has remained consistent since
>>this system call was first implemented: pivot_root()  changes  the
>>root  directory  and the current working directory of each process
>>or thread in the same mount namespace to new_root if they point to
>>the  old  root  directory.   (See also NOTES.)  On the other hand,
>>pivot_root() does not change the caller's current  working  direc‐
>>tory  (unless it is on the old root directory), and thus it should
>>be followed by a chdir("/") call.
>> ]]
> 
> Apologies. I saw that concern, but I didn't realize it was a question.
> 
> I think it is very reasonable to remove the warning that the behavior
> might change.  pivot_root(8) is in common use, and using it requires
> the semantic of changing processes other than the current process,
> which means any attempt to noticeably change the behavior of
> pivot_root(2) will break userspace.

Thanks for the confirmation that this change would be okay.
I will make this change soon, unless I hear a counterargument.

> Now the documented semantics in behavior above are not quite what
> pivot_root(2) does.  It walks all processes on the system and if the
> working directory or the root directory refer to the root mount that is
> being replaced, then pivot_root(2) will update them.
> 
> In practice the above is limited to a mount namespace.  But something as
> simple as "cd /proc//root" can allow a process to have a
> working directory in a different mount namespace.

So, I'm not quite clear. Do you mean that something in the existing
manual page text should change? If so, could you describe the
needed change please?

> Because ``unprivileged'' users can now use pivot_root(2) we may want to
> rethink the implementation at some point to be cheaper than a global

Re: For review: rewritten pivot_root(2) manual page

2019-10-09 Thread Eric W. Biederman
"Michael Kerrisk (man-pages)"  writes:

> Hello Eric,
>
> Thank you. I was hoping you might jump in on this thread.
>
> Please see below.
>
> On 10/9/19 10:46 AM, Eric W. Biederman wrote:
>> "Michael Kerrisk (man-pages)"  writes:
>> 
>>> Hello Philipp,
>>>
>>> My apologies that it has taken a while to reply. (I had been hoping
>>> and waiting that a few more people might weigh in on this thread.)
>>>
>>> On 9/23/19 3:42 PM, Philipp Wendler wrote:
>>>> Hello Michael,
>>>>
>>>> On 23.09.19 at 14:04, Michael Kerrisk (man-pages) wrote:
>>>>
>>>>> I'm considering to rewrite these pieces to exactly
>>>>> describe what the system call does (which I already
>>>>> do in the third paragraph) and remove the "may or may not"
>>>>> pieces in the second paragraph. I'd welcome comments
>>>>> on making that change.
>
> What did you think about my proposal above? To put it in context,
> this was my initial comment in the mail:
>
> [[
> One area of the page that I'm still not really happy with
> is the "vague" wording in the second paragraph and the note
> in the third paragraph about the system call possibly
> changing. These pieces survive (in somewhat modified form)
> from the original page, which was written before the
> system call was released, and it seems there was some
> question about whether the system call might still change
> its behavior with respect to the root directory and current
> working directory of other processes. However, after 19
> years, nothing has changed, and surely it will not in the
> future, since that would constitute an ABI breakage.
> I'm considering to rewrite these pieces to exactly
> describe what the system call does (which I already
> do in the third paragraph) and remove the "may or may not"
> pieces in the second paragraph. I'd welcome comments
> on making that change.
> ]]
>
> And the second and third paragraphs of the manual page currently
> read:
>
> [[
>pivot_root()  may  or may not change the current root and the cur‐
>rent working directory of any processes or threads  that  use  the
>old  root  directory  and which are in the same mount namespace as
>the caller of pivot_root().  The  caller  of  pivot_root()  should
>ensure  that  processes  with root or current working directory at
>the old root operate correctly in either case.   An  easy  way  to
>ensure  this is to change their root and current working directory
>to  new_root  before  invoking  pivot_root().   Note   also   that
>pivot_root()  may  or may not affect the calling process's current
>working directory.  It is therefore recommended to call chdir("/")
>immediately after pivot_root().
>
>The  paragraph  above  is  intentionally vague because at the time
>when pivot_root() was first implemented, it  was  unclear  whether
>its  effect  on other processes' root and current working directories
>(and the caller's current working  directory)  might  change  in
>the  future.   However, the behavior has remained consistent since
>this system call was first implemented: pivot_root()  changes  the
>root  directory  and the current working directory of each process
>or thread in the same mount namespace to new_root if they point to
>the  old  root  directory.   (See also NOTES.)  On the other hand,
>pivot_root() does not change the caller's current  working  direc‐
>tory  (unless it is on the old root directory), and thus it should
>be followed by a chdir("/") call.
> ]]

Apologies. I saw that concern, but I didn't realize it was a question.

I think it is very reasonable to remove the warning that the behavior
might change.  pivot_root(8) is in common use, and using it requires
the semantic of changing processes other than the current process,
which means any attempt to noticeably change the behavior of
pivot_root(2) will break userspace.

Now the documented semantics in behavior above are not quite what
pivot_root(2) does.  It walks all processes on the system and if the
working directory or the root directory refer to the root mount that is
being replaced, then pivot_root(2) will update them.
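
One way to observe this update from userspace is to read the
/proc/PID/root and /proc/PID/cwd symbolic links of a process before and
after a pivot.  The following is only a minimal sketch, not something
taken from this thread: read_proc_link() is a hypothetical helper name,
and the link text is resolved relative to the reader's own root, so the
string alone cannot tell two mounts apart.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: read the /proc/<pid>/<which> symbolic link
 * (which is "root" or "cwd") into buf as a NUL-terminated string.
 * Returns the length of the link text, or -1 with errno set.
 */
ssize_t read_proc_link(pid_t pid, const char *which, char *buf, size_t size)
{
    char path[64];
    ssize_t n;

    assert(which != NULL && buf != NULL && size > 0);

    snprintf(path, sizeof(path), "/proc/%ld/%s", (long) pid, which);

    /* readlink() does not append a terminating NUL, so reserve room. */
    n = readlink(path, buf, size - 1);
    if (n == -1)
        return -1;

    buf[n] = '\0';
    return n;
}
```

Calling read_proc_link(pid, "root", ...) and read_proc_link(pid, "cwd", ...)
for a process of interest on either side of a pivot_root(2) call would
show whether its root or working directory was moved.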

In practice the above is limited to a mount namespace.  But something as
simple as "cd /proc//root" can allow a process to have a
working directory in a different mount namespace.

Because ``unprivileged'' users can now use pivot_root(2) we may want to
rethink the implementation at some point to be cheaper than a global
process walk.  So far that process walk has not been a problem in
practice.

If we had to write pivot_root(2) from scratch, limiting it to just
changing the root directory of the process that calls pivot_root(2)
would have been the superior semantic.  That would have required running
pivot_root(8) like: "exec pivot_root . . -- /bin/bash ..."  but it would
not have required walking every thread in the system.

 I think that it would make the man pa

Re: For review: rewritten pivot_root(2) manual page

2019-10-09 Thread Michael Kerrisk (man-pages)
Hello Eric,

Thank you. I was hoping you might jump in on this thread.

Please see below.

On 10/9/19 10:46 AM, Eric W. Biederman wrote:
> "Michael Kerrisk (man-pages)"  writes:
> 
>> Hello Philipp,
>>
>> My apologies that it has taken a while to reply. (I had been hoping
>> and waiting that a few more people might weigh in on this thread.)
>>
>> On 9/23/19 3:42 PM, Philipp Wendler wrote:
>>> Hello Michael,
>>>
>>> On 23.09.19 at 14:04, Michael Kerrisk (man-pages) wrote:
>>>
>>>> I'm considering to rewrite these pieces to exactly
>>>> describe what the system call does (which I already
>>>> do in the third paragraph) and remove the "may or may not"
>>>> pieces in the second paragraph. I'd welcome comments
>>>> on making that change.

What did you think about my proposal above? To put it in context,
this was my initial comment in the mail:

[[
One area of the page that I'm still not really happy with
is the "vague" wording in the second paragraph and the note
in the third paragraph about the system call possibly
changing. These pieces survive (in somewhat modified form)
from the original page, which was written before the
system call was released, and it seems there was some
question about whether the system call might still change
its behavior with respect to the root directory and current
working directory of other processes. However, after 19
years, nothing has changed, and surely it will not in the
future, since that would constitute an ABI breakage.
I'm considering to rewrite these pieces to exactly
describe what the system call does (which I already
do in the third paragraph) and remove the "may or may not"
pieces in the second paragraph. I'd welcome comments
on making that change.
]]

And the second and third paragraphs of the manual page currently
read:

[[
   pivot_root()  may  or may not change the current root and the cur‐
   rent working directory of any processes or threads  that  use  the
   old  root  directory  and which are in the same mount namespace as
   the caller of pivot_root().  The  caller  of  pivot_root()  should
   ensure  that  processes  with root or current working directory at
   the old root operate correctly in either case.   An  easy  way  to
   ensure  this is to change their root and current working directory
   to  new_root  before  invoking  pivot_root().   Note   also   that
   pivot_root()  may  or may not affect the calling process's current
   working directory.  It is therefore recommended to call chdir("/")
   immediately after pivot_root().

   The  paragraph  above  is  intentionally vague because at the time
   when pivot_root() was first implemented, it  was  unclear  whether
   its  effect  on other processes' root and current working directories
   (and the caller's current working  directory)  might  change  in
   the  future.   However, the behavior has remained consistent since
   this system call was first implemented: pivot_root()  changes  the
   root  directory  and the current working directory of each process
   or thread in the same mount namespace to new_root if they point to
   the  old  root  directory.   (See also NOTES.)  On the other hand,
   pivot_root() does not change the caller's current  working  direc‐
   tory  (unless it is on the old root directory), and thus it should
   be followed by a chdir("/") call.
]]

>>> I think that it would make the man page significantly easier to
>>> understand if the vague wording and the meta discussion about it are
>>> removed.
>>
>> It is my inclination to make this change, but I'd love to get more
>> feedback on this point.
>>
>>>> DESCRIPTION
>>> [...]>pivot_root()  changes  the
>>>>root  directory  and the current working directory of each process
>>>>or thread in the same mount namespace to new_root if they point to
>>>>the  old  root  directory.   (See also NOTES.)  On the other hand,
>>>>pivot_root() does not change the caller's current  working  direc‐
>>>>tory  (unless it is on the old root directory), and thus it should
>>>>be followed by a chdir("/") call.
>>>
>>> There is a contradiction here with the NOTES (cf. below).
>>
>> See below.
>>
>>>>The following restrictions apply:
>>>>
>>>>-  new_root and put_old must be directories.
>>>>
>>>>-  new_root and put_old must not be on the same filesystem as  the
>>>>   current root.  In particular, new_root can't be "/" (but can be
>>>>   a bind mounted directory on the current root filesystem).
>>>
>>> Wouldn't "must not be on the same mountpoint" or something similar be
>>> more clear, at least for new_root? The note in parentheses indicates
>>> that new_root can actually be on the same filesystem as the current
>>> root. However, ...
>>
>> For 'put_old', it really is "filesystem".
> 
> If we are going to be pedantic "filesystem" is really the wrong concept

Re: For review: rewritten pivot_root(2) manual page

2019-10-09 Thread Eric W. Biederman
"Michael Kerrisk (man-pages)"  writes:

> Hello Philipp,
>
> My apologies that it has taken a while to reply. (I had been hoping
> and waiting that a few more people might weigh in on this thread.)
>
> On 9/23/19 3:42 PM, Philipp Wendler wrote:
>> Hello Michael,
>> 
>> On 23.09.19 at 14:04, Michael Kerrisk (man-pages) wrote:
>> 
>>> I'm considering to rewrite these pieces to exactly
>>> describe what the system call does (which I already
>>> do in the third paragraph) and remove the "may or may not"
>>> pieces in the second paragraph. I'd welcome comments
>>> on making that change.
>> 
>> I think that it would make the man page significantly easier to
>> understand if the vague wording and the meta discussion about it are
>> removed.
>
> It is my inclination to make this change, but I'd love to get more
> feedback on this point.
>
>>> DESCRIPTION
>> [...]>pivot_root()  changes  the
>>>root  directory  and the current working directory of each process
>>>or thread in the same mount namespace to new_root if they point to
>>>the  old  root  directory.   (See also NOTES.)  On the other hand,
>>>pivot_root() does not change the caller's current  working  direc‐
>>>tory  (unless it is on the old root directory), and thus it should
>>>be followed by a chdir("/") call.
>> 
>> There is a contradiction here with the NOTES (cf. below).
>
> See below.
>
>>>The following restrictions apply:
>>>
>>>-  new_root and put_old must be directories.
>>>
>>>-  new_root and put_old must not be on the same filesystem as  the
>>>   current root.  In particular, new_root can't be "/" (but can be
>>>   a bind mounted directory on the current root filesystem).
>> 
>> Wouldn't "must not be on the same mountpoint" or something similar be
>> more clear, at least for new_root? The note in parentheses indicates
>> that new_root can actually be on the same filesystem as the current
>> root. However, ...
>
> For 'put_old', it really is "filesystem".

If we are going to be pedantic "filesystem" is really the wrong concept
here.  The section about bind mount clarifies it, but I wonder if there
is a better term.

I think I would say: "new_root and put_old must not be on the same mount
as the current root."

I think using "mount" instead of "filesystem" keeps the concepts less
confusing.

As I am reading through this email and seeing text that is trying to be
precise and clear then hitting the term "filesystem" is a bit jarring.
pivot_root doesn't care a thing for file systems.  pivot_root only cares
about mounts.

And by a "mount" I mean the thing that you get when you create a bind
mount or you call mount normally.
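
To make "mount" in this sense a little more concrete, the sketch below
checks whether a path is currently the mount point of some mount in the
caller's mount namespace by scanning /proc/self/mountinfo.  It is only
an illustration under stated assumptions: is_mount_point() is a
hypothetical helper name, mount points containing characters that
mountinfo escapes (such as spaces) are not handled, and over-long
mountinfo lines are not handled either.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <limits.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper: returns 1 if path is listed as a mount point in
 * /proc/self/mountinfo, 0 if it is not, and -1 on error (for example,
 * if the path does not exist or /proc is unavailable).
 */
int is_mount_point(const char *path)
{
    char resolved[PATH_MAX];
    char line[4096];
    char mount_point[4096];
    FILE *fp;
    int found = 0;

    assert(path != NULL);

    /* Canonicalize so the comparison is against an absolute path. */
    if (realpath(path, resolved) == NULL)
        return -1;

    fp = fopen("/proc/self/mountinfo", "r");
    if (fp == NULL)
        return -1;

    while (fgets(line, sizeof(line), fp) != NULL) {
        /* Fields: mount ID, parent ID, major:minor, root, mount point */
        if (sscanf(line, "%*d %*d %*s %*s %4095s", mount_point) == 1 &&
            strcmp(mount_point, resolved) == 0) {
            found = 1;
            break;
        }
    }

    fclose(fp);
    return found;
}
```

A directory for which this returns 0 lives on whatever mount contains
it, which is the distinction from "filesystem" being drawn above.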

Michael do you have man pages for the new mount api yet?

> For 'new_root', see below.
>
>>>-  put_old must be at or underneath new_root; that  is,  adding  a
>>>   nonnegative  number  of /.. to the string pointed to by put_old
>>>   must yield the same directory as new_root.
>>>
>>>-  new_root must be a mount point.  (If  it  is  not  otherwise  a
>>>   mount  point,  it  suffices  to  bind  mount new_root on top of
>>>   itself.)
>> 
>> ... this item actually makes the above item almost redundant regarding
>> new_root (except for the "/" case). So one could replace this item with
>> something like this:
>> 
>> - new_root must be a mount point different from "/". (If it is not
>>   otherwise a mount point, it suffices  to bind mount new_root on top
>>   of itself.)
>> 
>> The above item would then only mention put_old (and maybe use clarified
>> wording on whether actually a different file system is necessary for
>> put_old or whether a different mount point is enough).
>
> Thanks. That's a good suggestion. I simplified the earlier bullet
> point as you suggested, and changed the text here to say:
>
>-  new_root must be a mount point, but can't be "/".  If it is not
>   otherwise  a mount point, it suffices to bind mount new_root on
>   top of itself.  (new_root can be a bind  mounted  directory  on
>   the current root filesystem.)

How about:
  - new_root must be the path to a mount, but can't be "/".  Any
  path that is not already a mount can be converted into one by
  bind mounting the path onto itself.
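
The bind-mount-onto-itself step can be sketched with mount(2) as shown
below.  This is a minimal illustration rather than text proposed for the
page; make_self_bind_mount() is a hypothetical helper name, and the call
is assumed to run with CAP_SYS_ADMIN in the caller's user namespace.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stddef.h>
#include <sys/mount.h>

/*
 * Hypothetical helper: turn an existing directory into a mount by bind
 * mounting it onto itself.  Returns 0 on success, or -1 with errno set
 * by mount(2).
 */
int make_self_bind_mount(const char *path)
{
    assert(path != NULL);

    /* For MS_BIND, the filesystem type and data arguments are ignored. */
    return mount(path, path, NULL, MS_BIND, NULL);
}
```

After a successful call, the path names a mount of its own and so can
be passed as new_root.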
  
>>> NOTES
>> [...]
>>>pivot_root() allows the caller to switch to a new root  filesystem
>>>while  at  the  same time placing the old root mount at a location
>>>under new_root from where it can subsequently be unmounted.   (The
>>>fact  that  it  moves  all processes that have a root directory or
>>>current working directory on the old root filesystem  to  the  new
>>>root  filesystem  frees the old root filesystem of users, allowing
>>>it to be unmounted more easily.)
>> 
>> Here is the contradiction:
>> The DESCRIPTION says that root and current working di

Re: For review: rewritten pivot_root(2) manual page

2019-10-09 Thread G. Branden Robinson
At 2019-10-09T09:41:34+0200, Michael Kerrisk (man-pages) wrote:
> I'm not sure. Some people have a bit of trouble to wrap their head
> around the pivot_root(".", ".") idea. (I possibly am one of them.)
> I'd be quite keen to hear other opinions on this. Unfortunately,
> few people have commented on this manual page rewrite.

pivot_root(".", ".") seems as ineffable to me as chdir(".").

Meaning mostly, but not completely.

I have an external drive with a USB cable that's a little dodgy.  If it
moves around a bit the external drive gets auto-unmounted, and then
remounted in the same place, so I can experience the otherwise-baffling
shell experience of:

[disconnect/reconnect happens; the device is mounted again now]
$ ls .
Input/output error
$ cd .
$ ls .
[perfectly fine listing]

What's happened is that the meaning of "." has subtly changed in a way
that I suppose would never have been seen back in Version 7 Unix days.
Maybe I've been reading too much historical documentation (I'm currently
enjoying McKusick et al.'s _Design and Implementation of the 4.4BSD
Operating System_), but the way we describe and teach Unixlike systems
in operating systems classes and, more to the point, in our man pages I
think continues to be strongly informed by the invariants we learned in
our youth, and which are slowly but steadily being invalidated.

Concretely, I recommend having pivot_root(".", ".") in the man page as
an example, but perhaps as an alternate.  Because it is
counterintuitive (to some minds), it's worth spending some time to
explain it.  But I would offer it because it's a valid use of the system
call and because it makes sense to a domain expert (Eric Biederman).
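
For readers who want to see the idiom written out, here is a minimal
sketch of the sequence as I understand it: change into new_root, pivot
with both arguments naming the current directory, then lazily detach
the old root.  The helper names are hypothetical, the call goes through
syscall(2) on the assumption that the C library declares no wrapper, and
the caller is assumed to have CAP_SYS_ADMIN, a new_root that is already
a mount, and mount propagation that is not shared.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stddef.h>
#include <sys/mount.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Hypothetical thin wrapper around the raw system call. */
static int pivot_root_raw(const char *new_root, const char *put_old)
{
    return (int) syscall(SYS_pivot_root, new_root, put_old);
}

/*
 * Hypothetical helper: make new_root the root mount without creating a
 * separate directory for put_old.  Returns 0 on success, or -1 with
 * errno set by the step that failed.
 */
int switch_root_in_place(const char *new_root)
{
    assert(new_root != NULL);

    if (chdir(new_root) == -1)
        return -1;

    /* Both arguments name the current directory, that is, new_root. */
    if (pivot_root_raw(".", ".") == -1)
        return -1;

    /* Lazily detach the old root, which is now stacked at "/". */
    if (umount2(".", MNT_DETACH) == -1)
        return -1;

    return 0;
}
```

Whether a page example should use this form or a separate put_old
directory is exactly the question under discussion; the sketch is only
meant to show how short the "." form is.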

I would try to offer an explanation myself but I lack the understanding.
_If_ I'm following the discussion correctly, which I doubt, then what I
imagine to happen is that a sequence point occurs between the function
parameters, and "." changes its meaning as with my "cd ." example above.
I am probably reasoning by analogy, and perhaps not by a good one.

Also, it is okay if the language of this page continues to evolve over
time.  I appreciate your desire to get it "perfect" (or at least to some
local optimum) now since you're most of the way through an overhaul of
it, but it is not just the system that changes with time--the audience
does too.

Maybe in 5 or 10 years, the kids will be au fait with pivot_root(".",
".") and only some graybeards will continue to think of it as a bit
strange.

Regards,
Branden



Re: For review: rewritten pivot_root(2) manual page

2019-10-09 Thread Michael Kerrisk (man-pages)
Hello Philipp,

My apologies that it has taken a while to reply. (I had been hoping
and waiting that a few more people might weigh in on this thread.)

On 9/23/19 3:42 PM, Philipp Wendler wrote:
> Hello Michael,
> 
> On 23.09.19 at 14:04, Michael Kerrisk (man-pages) wrote:
> 
>> I'm considering to rewrite these pieces to exactly
>> describe what the system call does (which I already
>> do in the third paragraph) and remove the "may or may not"
>> pieces in the second paragraph. I'd welcome comments
>> on making that change.
> 
> I think that it would make the man page significantly easier to
> understand if the vague wording and the meta discussion about it are
> removed.

It is my inclination to make this change, but I'd love to get more
feedback on this point.

>> DESCRIPTION
> [...]>pivot_root()  changes  the
>>root  directory  and the current working directory of each process
>>or thread in the same mount namespace to new_root if they point to
>>the  old  root  directory.   (See also NOTES.)  On the other hand,
>>pivot_root() does not change the caller's current  working  direc‐
>>tory  (unless it is on the old root directory), and thus it should
>>be followed by a chdir("/") call.
> 
> There is a contradiction here with the NOTES (cf. below).

See below.

>>The following restrictions apply:
>>
>>-  new_root and put_old must be directories.
>>
>>-  new_root and put_old must not be on the same filesystem as  the
>>   current root.  In particular, new_root can't be "/" (but can be
>>   a bind mounted directory on the current root filesystem).
> 
> Wouldn't "must not be on the same mountpoint" or something similar be
> more clear, at least for new_root? The note in parentheses indicates
> that new_root can actually be on the same filesystem as the current
> root. However, ...

For 'put_old', it really is "filesystem".

For 'new_root', see below.

>>-  put_old must be at or underneath new_root; that  is,  adding  a
>>   nonnegative  number  of /.. to the string pointed to by put_old
>>   must yield the same directory as new_root.
>>
>>-  new_root must be a mount point.  (If  it  is  not  otherwise  a
>>   mount  point,  it  suffices  to  bind  mount new_root on top of
>>   itself.)
> 
> ... this item actually makes the above item almost redundant regarding
> new_root (except for the "/" case). So one could replace this item with
> something like this:
> 
> - new_root must be a mount point different from "/". (If it is not
>   otherwise a mount point, it suffices  to bind mount new_root on top
>   of itself.)
> 
> The above item would then only mention put_old (and maybe use clarified
> wording on whether actually a different file system is necessary for
> put_old or whether a different mount point is enough).

Thanks. That's a good suggestion. I simplified the earlier bullet
point as you suggested, and changed the text here to say:

   -  new_root must be a mount point, but can't be "/".  If it is not
      otherwise  a mount point, it suffices to bind mount new_root on
      top of itself.  (new_root can be a bind  mounted  directory  on
      the current root filesystem.)

>> NOTES
> [...]
>>pivot_root() allows the caller to switch to a new root  filesystem
>>while  at  the  same time placing the old root mount at a location
>>under new_root from where it can subsequently be unmounted.   (The
>>fact  that  it  moves  all processes that have a root directory or
>>current working directory on the old root filesystem  to  the  new
>>root  filesystem  frees the old root filesystem of users, allowing
>>it to be unmounted more easily.)
> 
> Here is the contradiction:
> The DESCRIPTION says that root and current working dir are only changed
> "if they point to the old root directory". Here in the NOTES it says
> that any root or working directories on the old root file system (i.e.,
> even if somewhere below the root) are changed.
> 
> Which is correct?

The first text is correct. I must have accidentally inserted
"filesystem" into the paragraph just here during a global edit.
Thanks for catching that.

> If it indeed affects all processes with root and/or current working
> directory below the old root, the text here does not clearly state what
> the new root/current working directory of these processes is.
> E.g., if a process is at /foo and we pivot to /bar, will the process be
> moved to /bar (i.e., at / after pivot_root), or will the kernel attempt
> to move it to some location like /bar/foo? Because the latter might not
> even exist, I suspect that everything is just moved to new_root, but
> this could be stated explicitly by replacing "to the new root
> filesystem" in the above paragraph with "to the new root directory"
> (after checking whether this is true).

The text here now reads:

 

Re: For review: rewritten pivot_root(2) manual page

2019-09-23 Thread Philipp Wendler
Hello Michael,

On 23.09.19 at 14:04, Michael Kerrisk (man-pages) wrote:

> I'm considering to rewrite these pieces to exactly
> describe what the system call does (which I already
> do in the third paragraph) and remove the "may or may not"
> pieces in the second paragraph. I'd welcome comments
> on making that change.

I think that it would make the man page significantly easier to
understand if the vague wording and the meta discussion about it are
removed.

> DESCRIPTION
[...]
>pivot_root()  changes  the
>root  directory  and the current working directory of each process
>or thread in the same mount namespace to new_root if they point to
>the  old  root  directory.   (See also NOTES.)  On the other hand,
>pivot_root() does not change the caller's current  working  direc‐
>tory  (unless it is on the old root directory), and thus it should
>be followed by a chdir("/") call.

There is a contradiction here with the NOTES (cf. below).

>The following restrictions apply:
> 
>-  new_root and put_old must be directories.
> 
>-  new_root and put_old must not be on the same filesystem as  the
>   current root.  In particular, new_root can't be "/" (but can be
>   a bind mounted directory on the current root filesystem).

Wouldn't "must not be on the same mountpoint" or something similar be
more clear, at least for new_root? The note in parentheses indicates
that new_root can actually be on the same filesystem as the current
root. However, ...

>-  put_old must be at or underneath new_root; that  is,  adding  a
>   nonnegative  number  of /.. to the string pointed to by put_old
>   must yield the same directory as new_root.
> 
>-  new_root must be a mount point.  (If  it  is  not  otherwise  a
>   mount  point,  it  suffices  to  bind  mount new_root on top of
>   itself.)

... this item actually makes the above item almost redundant regarding
new_root (except for the "/" case). So one could replace this item with
something like this:

- new_root must be a mount point different from "/". (If it is not
  otherwise a mount point, it suffices to bind mount new_root on top
  of itself.)

The above item would then only mention put_old (and maybe use clarified
wording on whether actually a different file system is necessary for
put_old or whether a different mount point is enough).
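
To make the "bind mount new_root on top of itself" step concrete, here
is a minimal sketch. The helper name make_mount_point() is made up for
illustration and is not part of any API. It assumes the caller has
CAP_SYS_ADMIN in the relevant namespace.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <sys/mount.h>

/* Hypothetical helper (illustration only): turn an existing directory
   into a mount point by bind mounting it on top of itself.
   Returns 0 on success, -1 on failure with errno set by mount(2). */
static int
make_mount_point(const char *dir)
{
    assert(dir != NULL);

    /* For MS_BIND, the filesystemtype and data arguments are ignored. */
    if (mount(dir, dir, NULL, MS_BIND, NULL) == -1) {
        int saved_errno = errno;    /* fprintf() may modify errno */

        fprintf(stderr, "bind mount of %s on itself failed: %s\n",
                dir, strerror(saved_errno));
        errno = saved_errno;
        return -1;
    }

    return 0;
}
```

After a successful call, the directory is a mount point distinct from
"/", even though it still lives on the same filesystem as the current
root, which is the case the parenthetical note in the restriction list
is hinting at.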

> NOTES
[...]
>pivot_root() allows the caller to switch to a new root  filesystem
>while  at  the  same time placing the old root mount at a location
>under new_root from where it can subsequently be unmounted.   (The
>fact  that  it  moves  all processes that have a root directory or
>current working directory on the old root filesystem  to  the  new
>root  filesystem  frees the old root filesystem of users, allowing
>it to be unmounted more easily.)

Here is the contradiction:
The DESCRIPTION says that root and current working dir are only changed
"if they point to the old root directory". Here in the NOTES it says
that any root or working directories on the old root file system (i.e.,
even if somewhere below the root) are changed.

Which is correct?

If it indeed affects all processes with root and/or current working
directory below the old root, the text here does not clearly state what
the new root/current working directory of these processes is.
E.g., if a process is at /foo and we pivot to /bar, will the process be
moved to /bar (i.e., at / after pivot_root), or will the kernel attempt
to move it to some location like /bar/foo? Because the latter might not
even exist, I suspect that everything is just moved to new_root, but
this could be stated explicitly by replacing "to the new root
filesystem" in the above paragraph with "to the new root directory"
(after checking whether this is true).

> EXAMPLE
>The program below demonstrates the use of  pivot_root()  inside  a
>mount namespace that is created using clone(2).  After pivoting to
>the root directory named in the program's first command-line argu‐
>ment,  the  child  created  by  clone(2) then executes the program
>named in the remaining command-line arguments.

Why not use the pivot_root(".", ".") in the example program?
It would make the example shorter, and also works if the process cannot
write to new_root (e.g., in a user namespace).
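
For reference, a minimal sketch of the sequence I have in mind. The
helper name pivot_into() is made up for illustration. It assumes that
new_root is already a mount point and that the caller has
CAP_SYS_ADMIN in the user namespace owning the mount namespace.
pivot_root() is invoked via syscall(2) because glibc has historically
provided no wrapper for it.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <sys/mount.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Hypothetical helper (illustration only): make new_root the root
   mount of the caller's mount namespace without creating a put_old
   directory inside it.
   Returns 0 on success, -1 on failure with errno set. */
static int
pivot_into(const char *new_root)
{
    assert(new_root != NULL);

    /* Make new_root the current working directory, so that "." can be
       used for both arguments of pivot_root(). */
    if (chdir(new_root) == -1)
        return -1;

    /* With new_root and put_old both given as ".", the old root mount
       ends up stacked on top of the new root mount at "/". */
    if (syscall(SYS_pivot_root, ".", ".") == -1)
        return -1;

    /* Resolving "." now reaches the topmost mount stacked at "/",
       which is the old root; detach it lazily. */
    if (umount2(".", MNT_DETACH) == -1)
        return -1;

    /* Leave the caller with a well-defined working directory. */
    return chdir("/");
}
```

Nothing is created under new_root, so this also works when new_root is
read-only for the caller. If the parent mount of new_root or of the
current root has shared propagation, pivot_root() is expected to fail
with EINVAL; the usual remedy is to make the mounts private first, for
example with mount(NULL, "/", NULL, MS_REC | MS_PRIVATE, NULL).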

Regards,
Philipp