Re: safe_rename() and verifying the result of link(2)

2018-08-24 Thread Vincent Lefevre
On 2018-08-22 15:35:16 +0200, Steffen Nurpmeso wrote:
> Vincent Lefevre wrote in <20180821230229.ga16...@zira.vinc17.org>:
>  |* It is not clear whether this has ever been useful (nothing in
>  |  comment or in the commit log).
> 
> According to some statement on a list you are also subscribed to
> this NFS problem has been fixed "at the early 90s".

This is quite surprising. Even the discussion at

  https://www.experts-exchange.com/questions/10078625/atomic-locking-over-NFS-with-link-2-stat-2.html

in 1998 doesn't mention this problem (it just mentions the case where
the link(2) call failed though the link was created).

Several old documents say not to use the return value of the link(2)
call, but do not say whether this is only due to the case where
link(2) fails but the link was created, or the other way round.

-- 
Vincent Lefèvre  - Web: 
100% accessible validated (X)HTML - Blog: 
Work: CR INRIA - computer arithmetic / AriC project (LIP, ENS-Lyon)


Re: safe_rename() and verifying the result of link(2)

2018-08-24 Thread Vincent Lefevre
On 2018-08-23 14:50:31 -0500, Derek Martin wrote:
> The former (ignoring link's return value and always doing the stat
> comparison) is probably safest for most users.  The latter will
> apparently be less problematic for SSHFS-like quirks, but as we saw
> there's an SSHFS option that makes it work without this behavior.

You have no proof of that. If I understand correctly, you suggest to
ignore link's return value in all cases because of a potential race
condition on some old system. But then, if you use the SSHFS option
to disable hard links, so that Mutt will use rename() instead, then
you also need to consider race conditions it can introduce... If
you think that rename() is at least as good as the link() stuff,
then safe_rename is useless: you can just use rename(). :-)



Re: safe_rename() and verifying the result of link(2)

2018-08-24 Thread Steffen Nurpmeso
Vincent Lefevre wrote in <20180824091156.ga20...@zira.vinc17.org>:
 |On 2018-08-22 15:35:16 +0200, Steffen Nurpmeso wrote:
 |> Vincent Lefevre wrote in <20180821230229.ga16...@zira.vinc17.org>:
 |>|* It is not clear whether this has ever been useful (nothing in
 |>|  comment or in the commit log).
 |> 
 |> According to some statement on a list you are also subscribed to
 |> this NFS problem has been fixed "at the early 90s".
 |
 |This is quite surprising. Even the discussion at
 |
 |  https://www.experts-exchange.com/questions/10078625/atomic-locking-over-NFS-with-link-2-stat-2.html
 |
 |in 1998 doesn't mention this problem (it just mentions the case where
 |the link(2) call failed though the link was created).
 |
 |Several old documents say not to use the return value of the link(2)
 |call, but do not say whether this is only due to the case where
 |link(2) fails but the link was created, or the other way round.

Oh, wait!  This was a false remembrance; I was referring to a message
from Casper Dik of Oracle, who wrote on 2015-12-31:

  >/* Create a unique file. O_EXCL does not really work over NFS so we follow
  > * the following trick (inspired by S.R. van den Berg):
  > * - make a mostly unique filename and try to create it
  > * - link the unique filename to our target
  > * - get the link count of the target
  > * - unlink the mostly unique filename
  > * - if the link count was 2, then we are ok; else we've failed */

  The problem of not being able to create a file with O_EXCL was, I think,
  fixed in NFSv3 (if not, certainly in NFSv4)

  Casper

so this was not about link but about O_EXCL.  About a year later
(2016-11-02) there was a pair of messages between Stèphane and
Jörg about links via NFS, as in "IIRC there were issues with ln on
NFS for instance." and "Could you please explain what you have in
mind?  I would like to understand whether there really is a NFS
problem or whether there is just a NFS bug in Linux", but nothing
more than that.
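
The steps in the quoted comment can be sketched roughly as follows. This
is only an illustration in Python, not code from any of the programs
discussed: the function name `create_exclusive` and the naming scheme for
the "mostly unique" file are hypothetical. One deliberate variation: the
sketch reads the link count of the unique file rather than of the target
(as the Linux open(2) man page suggests), so that a target created by
another process is not mistaken for our own.

```python
import os
import socket
import time


def create_exclusive(target: str) -> bool:
    """Try to create `target` exclusively via the link-count trick.

    Returns True if our link to `target` was created, False otherwise.
    """
    directory = os.path.dirname(os.path.abspath(target))
    # "Mostly unique" name built from host, PID and a timestamp
    # (hypothetical scheme, chosen only for this sketch).
    unique = os.path.join(
        directory,
        f".tmp.{socket.gethostname()}.{os.getpid()}.{time.time_ns()}",
    )
    fd = os.open(unique, os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o600)
    os.close(fd)
    try:
        try:
            os.link(unique, target)
        except OSError:
            # Ignored on purpose: success is judged by the link count.
            pass
        nlink = os.stat(unique).st_nlink
    finally:
        os.unlink(unique)
    # Two links means both the unique name and the target point at our file.
    return nlink == 2
```

A caller would treat a True result as "we own the target" and a False
result as "someone else got there first, or the link could not be made".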

--steffen
|
|Der Kragenbaer,                The moon bear,
|der holt sich munter           he cheerfully and one by one
|einen nach dem anderen runter  wa.ks himself off
|(By Robert Gernhardt)


Re: safe_rename() and verifying the result of link(2)

2018-08-24 Thread Derek Martin
On Fri, Aug 24, 2018 at 11:35:50AM +0200, Vincent Lefevre wrote:
> On 2018-08-23 14:50:31 -0500, Derek Martin wrote:
> > The former (ignoring link's return value and always doing the stat
> > comparison) is probably safest for most users.  The latter will
> > apparently be less problematic for SSHFS-like quirks, but as we saw
> > there's an SSHFS option that makes it work without this behavior.
> 
> You have no proof of that.

No proof of what?  I think I can prove that everything I said is
documented, or follows logically from things that are documented.  I
think mostly, I've already done so, though perhaps some of the sources
need to be referenced.

> If I understand correctly, you suggest to ignore link's return value
> in all cases because of a potential race condition on some old
> system.

You do not understand correctly.

The main issue is, and always was, the unreliability of the return
value of link() over NFS, which the man page on Linux indicates is a
current problem.  There's also a lack of clarity about how that
unreliability manifests on Linux, and a general lack of info about the
problem from a portability perspective.

The purpose of safe_rename() (according to its comments) is to attempt
to provide an equivalent option that is more reliable over NFS.  But
there's a reliable test using stat(), irrespective of the reliability
of the return value of link(), to determine if link() actually worked,
which is known to work over NFS.  So we should ignore the link()
return value and use the reliable stat() test, always.  
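
A minimal sketch of that approach, written in Python for brevity rather
than taken from Mutt's C source: the name `safe_rename_sketch` is
hypothetical, and the sketch assumes that the "stat() test" means
comparing device and inode numbers of the two names, which the messages
above do not spell out.

```python
import os


def safe_rename_sketch(src: str, target: str) -> bool:
    """Move `src` to `target` without overwriting an existing target.

    The result of link() is ignored; a stat comparison decides.
    """
    try:
        os.link(src, target)
    except OSError:
        # Ignored on purpose: over NFS the return value may be unreliable.
        pass
    try:
        src_stat = os.lstat(src)
        target_stat = os.lstat(target)
    except OSError:
        return False
    # Assumption: "same file" means same device and inode number.
    if not os.path.samestat(src_stat, target_stat):
        return False
    # Both names refer to the same file, so the old name can go.
    os.unlink(src)
    return True
```

If the target already exists as a different file, the comparison fails
and the source is left in place. On a file system that emulates link()
by copying, the comparison would fail as well, which is the SSHFS case
discussed next.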

The exception to this is SSHFS, which (IIUC) depending on its
configuration, fakes link() by copying the file.  Doing that would
obviously make the stat() check ALWAYS fail.  I would propose that we
ignore SSHFS since it's a hack for which we know that semantics that
POSIX requires are not supported.  And as you say, it seems to have a
workaround that works well enough considering it's a hack.

But, in the event someone thinks SSHFS should be supported as a
first-class file system, the alternative I suggested should
work better for SSHFS, because there the stat() test will always fail.
The alternative may fail randomly if the NFS server crashes, but a) that
will be far rarer than always, and b) SSHFS is a hack so who
cares?

Is that clear?

> But then, if you use the SSHFS option to disable hard links, so that
> Mutt will use rename() instead, then you also need to consider race
> conditions it can introduce...

1. No we don't, because SSHFS is a hack and who cares?  The user
   should be prepared to deal with its inability to provide semantics
   that POSIX requires.

2. I've previously said that rename() is NOT as good as link(), and
   should be used only as a last resort.  Both versions of the
   algorithm I described take that into account.

That said, in practice, I actually think it would be fine to eliminate
safe_rename() entirely and just use rename().  The problem that it
tries to solve should be sufficiently rare that I think it's
reasonable to make the user deal with it manually.  And as I've said
many times in the (mostly distant) past, reading your mail over NFS is
stupid anyway, precisely because of problems like these.  Probably
many people care about Mutt working correctly over NFS (as much as
possible)... I am not one of them.

-- 
Derek D. Martin    http://www.pizzashack.org/   GPG Key ID: 0xDFBEAD02
-=-=-=-=-
This message is posted from an invalid address.  Replying to it will result in
undeliverable mail due to spam prevention.  Sorry for the inconvenience.


