On Mon, Apr 1, 2013 at 8:22 PM, Jeff King <[email protected]> wrote:
> On Mon, Apr 01, 2013 at 05:58:55PM -0600, Felipe Contreras wrote:
>> Are you saying that even if we have stored the result of a waitpid
>> command, if errno is EINTR, then we should still loop waitpid()? If
>> so, I guess this would do the trick:
>
> Yes, I think that would work. Though I wonder if it is even worth
> storing EINTR at all in the first place. It tells us nothing. In fact,
> does storing any error condition really tell us anything?
Probably not, I just tried to minimize the potential behavior changes.
> The two states
> we are interested in at this point are:
>
> 1. We have reaped the child via waitpid; here is its status.
>
> 2. We have not (either we did not try, it was not dead yet, or we were
> not able to due to an error). We should now try it again.
>
> If we got EINTR the first time around, we would likely get the "real"
> answer this time. If we get anything else (like EINVAL or ECHILD), then
> we would get the same thing again calling waitpid() later.
>
>> > We now take argv0 into wait_or_whine. But I don't see it being used.
>> > What's it for?
>>
>> It was there before:
>> -static int wait_or_whine(pid_t pid, const char *argv0)
>> +static int wait_or_whine(struct child_process *cmd, pid_t pid, const
>> char *argv0)
>
> Ah, sorry, I misread the diff. We are adding "cmd", not "argv0".
Yeah, which in fact was already there before.
> That would trigger the rest of your code in the error case, which I
> think was your original intent. But then we return "0" from
> check_command. Is that right?
>
> There are three states we can be in from calling waitpid:
>
> 1. The process is dead.
>
> 2. The process is not dead.
>
> 3. We could not determine which because waitpid returned an error.
>
> It is clear that check_command is trying to tell its caller (1) or (2);
> but what should it say in case of (3)?
>
> Naively, given how patch 2 uses it, I think it would actually make sense
> for it to return 1. That is, the semantics are "return 0 if and only if
> the pid is verified to be dead; otherwise return 1".
I thought if there was an error that constituted a failure.
> But if we know from reading waitpid(3) that waitpid should only fail due
> to EINTR, or due to bogus arguments (e.g., a pid that does not exist or
> has already been reaped), then maybe something like this makes sense:
>
> while ((waiting = waitpid(pid, &status, 0)) < 0 && errno == EINTR)
> ; /* nothing */
But we don't want to wait synchronously here, we just want to ping.
> /* pid definitely still going */
> if (!waiting)
> return 1;
>
> /* pid definitely died */
> if (waiting == cmd->pid) {
> cmd->last_status.valid = 1;
> cmd->last_status.status = status;
> return 0;
> }
>
> /*
> * this should never happen, since we handed waitpid() a single
> * pid, so it should either return that pid, 0, or an error.
> */
> if (waiting > 0)
> die("BUG: waitpid reported a random pid?");
>
> /*
> * otherwise, we have an error. Assume the pid is gone, since that
> * is the only reason for waitpid to report a problem besides EINTR.
> * We don't bother recording errno, since we can just repeat
> * the waitpid again later.
> */
> return 0;
The rest makes sense.
>> >> + cmd->last_wait.code = -1;
>> >> + cmd->last_wait.failed_errno = failed_errno;
>> >> + cmd->last_wait.status = status;
>> >
>> > Since we can only get here when waiting == cmd->pid,
>>
>> No, also when waiting < 0.
>
> After the fix above, yes; in the original we would always have exited
> already.
No:
+ if (waiting != cmd->pid)
+ return 1;
If waiting < 0, waiting != cmd->pid, and therefore this return is not
triggered, and there's only one more return at the end of the
function.
Cheers.
--
Felipe Contreras
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to [email protected]
More majordomo info at http://vger.kernel.org/majordomo-info.html