On Sun, 11 Aug 2002, Ian Dowse wrote:

> In message <[EMAIL PROTECTED]>, Bruce Evans writes:
> >
> >I don't know how open() of a disk device can be interrupted by a signal
> >in practice.  Most disk operations don't check for signals.
>
> Does the PCATCH tsleep in diskopen() that I mentioned seem a likely
> candidate?

Yikes.  I didn't notice this or that you mentioned it.

> Anyway, below is a simple program that reproduces the
> EINTR error fairly reliably for me when run on disk devices.
>
> #include <sys/types.h>
> #include <err.h>
> #include <fcntl.h>
> #include <signal.h>
> #include <unistd.h>
>
> void
> handler(int sig) {
> }
>
> int
> main(int argc, char **argv)
> {
>       int fd, i;
>       if (argc < 2)
>               errx(1, "Usage: %s device", argv[0]);
>       fork();
>       fork();
>       fork();
>       fork();
>
>       signal(SIGUSR1, handler);
>       sleep(1);
>
>       for (i = 0; i < 200; i++) {
>               killpg(0, SIGUSR1);
>               if ((fd = open(argv[1], O_RDONLY)) < 0)
>                       err(1, "%s", argv[1]);
>               close(fd);
>       }
>       return 0;
> }

This works reliably because the first open takes a relatively long time
to do physical i/o's to probe the MBR, etc.  The other opens wait
interruptibly for the first, so there is a large window for killing them.

I just saw a reply from the original (?) author of the PR.  Apparently
dump gets killed by amanda.  Does amanda actually kill its children
enough to matter?

The primary bug seems to be that open() returns EINTR despite SA_RESTART
being set.  POSIX.1-200x-draft7 seems to say that SA_RESTART affects
_all_ syscalls that can return EINTR except as explicitly noted.
This is not explicitly noted for open(), but there don't seem to be
enough explicit notes -- I could only find one for [p]select().
In POSIX, SA_RESTART's effect on select() and poll() is
implementation-defined and required, respectively, but in FreeBSD
SA_RESTART affects neither, and many applications depend on this for
select() at least.

POSIX's restarting of syscalls seems to be hard to implement.  POSIX
requires restarting at the point where the interrupt was received, but
FreeBSD unwinds all the way back to syscall() for delivering the signal
and then restarts from there.  The FreeBSD implementation only works
for simple syscalls.  Note that the most important cases of read() and
write() are not easy to restart from syscall() in general, but we avoid
problems by only restarting them if they haven't done any i/o.  This
is POSIX conformant since we don't return EINTR if they did any i/o,
and POSIX explicitly permits this behaviour although it is inconvenient
for applications (applications still have to deal with short i/o's).
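
For what "dealing with short i/o's" means on the application side,
something like the following loop is usually enough.  It is a minimal
sketch; read_full() is a hypothetical name, not an existing libc
function.  It keeps reading after a short count and retries when read()
fails with EINTR before transferring anything.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: read up to len bytes from fd into buf.
 * Continues after short reads and retries on EINTR.
 * Returns the number of bytes read (less than len only at EOF),
 * or -1 on any other error.
 */
static ssize_t
read_full(int fd, void *buf, size_t len)
{
	char *p = buf;
	size_t done = 0;
	ssize_t n;

	while (done < len) {
		n = read(fd, p + done, len - done);
		if (n < 0) {
			if (errno == EINTR)
				continue;	/* interrupted before any i/o */
			return (-1);
		}
		if (n == 0)
			break;			/* EOF */
		done += (size_t)n;		/* possibly a short read */
	}
	return ((ssize_t)done);
}
```

The same pattern applies to write(), with the difference that a return
of 0 there is not an end-of-file indication.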

Bruce

