On Fri, Apr 29, 2022 at 6:57 PM Jason A. Donenfeld <ja...@zx2c4.com> wrote:
> On Fri, Apr 29, 2022 at 6:04 PM Denys Vlasenko <vda.li...@googlemail.com> 
> wrote:
> > On Wed, Apr 27, 2022 at 6:55 PM Jason A. Donenfeld <ja...@zx2c4.com> wrote:
> > > On Wed, Apr 27, 2022 at 06:15:50PM +0200, Denys Vlasenko wrote:
> > > >         if ((unlink(filename) < 0 || fsync(dfd) < 0) && seed_len) {
> > > >                 bb_perror_msg("can't%s seed", " remove");
> > > >                 return -1;
> > > >         }
> > > >
> > > > Why can't the entire above if() be replaced with xunlink(filename) ?
> > >
> > > It cannot be replaced with that. The fsync() must happen before other
> > > operations move forward,
> >
> > Why? It should be explained in a comment, and explained well enough
> > so that future developers do not repeatedly try to remove it
> > because "I don't see why it's necessary".
>
> I'll add a comment and submit v2 of my recent patch.

It'll be faster if we just discuss it in this thread.
Describe a realistic scenario where having this fsync()
prevents a problem.

> > > and exiting the program isn't necessarily the
> > > correct course of action.

Of course. I'm not saying that "immediately exit with error message"
_always_ is the correct approach.

What I saw a lot over the years is that error paths
tend to be not well-tested, and nevertheless, even when they
are not buggy per se, the error conditions they are reacting to
are often a part of a bigger problem (usually beyond the ability
of the code in question to detect, much less sensibly correct).

For example, if you get a ENOSPC on write, you might be
tempted to return a special exit code signifying this specific error.

However.
I'd bet you real money that NOT ONE of callers of your tool
will ever bother checking specifically for this exit code.
At best, the callers will react uniformly for any non-zero exit code as
"something went wrong". Quite often, check for !0 exit code will be
entirely forgotten.

What *is* important in this particular example, in practice?
Exiting with nonzero exitcode (any nonzero would do, say 1)
and _printing perror_ - in practice, that's what makes users
understand what exactly the problem is (in this case,
"aha, the disk is full!!!")

(A counter-example when "exit wih error" is totally wrong
is a database server application: on ENOSPC, it almost certainly
should NOT exit! Oracle DB, for example, will wait until
the problem is fixed).

> > Let's see.
> > If we opened the file, read it, and then failed to unlink it,
> > it means that filesystem is read-only, there is a permission problem
> > (e.g. SELinux), or there is an I/O error.
> > All of these cases mean that the system is not in expected state,
> > and further efforts to "continue as normal" (e.g. try to rewrite
> > /var/lib/seedrng/seed.credit) do not seem likely to be productive to me.
> > I think it's better to just complain and exit.
>
> No, this isn't okay. It makes it impossible to distinguish which of
> the seeds was used by the caller of the program from inspecting the
> error code,

Can you show me an existing user of seedrng which checks for this
specific exit code?

> and it means that it can't attempt the second one if the
> first fails (because of, say, the immutable file attribute being set).

If your system starts getting immutable attributes semi-randomly
set on some files but not others, the mitigation of using the second file
to successfully seed the RNG seems like polishing Titanic's bronze.
We need to let the user know there's an unexpected problem
(and we do). Continuing trying to work is not that important.
_______________________________________________
busybox mailing list
busybox@busybox.net
http://lists.busybox.net/mailman/listinfo/busybox

Reply via email to