On Sat, Nov 28, 2020 at 01:41:50PM -0800, Philip Guenther wrote:
> On Fri, Nov 27, 2020 at 10:35 PM Philip Guenther <[email protected]> wrote:
> ...
>
> > So yeah, maybe it does work to:
> > 1) make __sfp() FLOCKFILE() the allocated FILE before unlocking sfp_mutex
> > 2) update f{,d,mem,un}open() and open_*memstream() to match (1), unlocking
> > in all paths when the FILE is safe to be accessed by _fwalk(), and
> > locking
> > sfp_mutex around the zeroing of _flags.
> > 3) make fclose() and freopen() also lock sfp_mutex around the zeroing of
> > _flags
> > (should add an _frelease() to findfp.c that does this dance for (2) and
> > (3))
> > 4) make _fwalk() take sfp_mutex, and maybe also a FILE* so the setting of
> > __SIGN can be done under the lock?
> >
> 5) decide how/whether to handle adjust the FLOCKFILE placement in the
> > _fwalk()
> > tree: is the testing of the "is line-buffered" flag in lflush() safe
> > without
> > a lock? Mumble...
> >
>
> After thinking through states more, #4 isn't safe: _fwalk() can't take
> sfp_mutex because it's called from __srefill() which has its FILE locked,
> which would reverse the order: two concurrent calls to __srefill() from
> line-buffered FILEs could have one in _fwalk() blocking on locking the
> other, while the other blocks on the sfp_mutex for _fwalk().
This part in __srefill():
/*
* Before reading from a line buffered or unbuffered file,
* flush all line buffered output files, per the ANSI C
* standard.
*/
if (fp->_flags & (__SLBF|__SNBF)) {
/* Ignore this file in _fwalk to avoid potential deadlock. */
fp->_flags |= __SIGN;
(void) _fwalk(lflush);
fp->_flags &= ~__SIGN;
/* Now flush this file without locking it. */
if ((fp->_flags & (__SLBF|__SWR)) == (__SLBF|__SWR))
__sflush(fp);
}
seems to confound all sensible locking hierarchies.
You need to lock the FILE you're trying to refill. Then you need to
check how it is buffered. If it is buffered in a particular way, this
is supposed to trigger a flush on *all other* line-buffered FILEs?
How can you do that without possible deadlock without first yielding
the lock for the FILE in question? And then you've got a race?
Where in the standard(s) is this behavior required? I'm not even sure
how to look for it.
> Hmm, there's currently a loop between two __srefill() calls like that, as
> there's nothing to force visibility of the __SIGN flag between CPUs so they
> could try to lock each other. Grrr.
>
> Time to check other BSDs and see if they have a good solution to this...
I'd say so.