Re: [PATCHv2] run-command: detect finished children by closed pipe rather than waitpid
On Fri, Nov 20, 2015 at 10:58 PM, Torsten Bögershausenwrote: > On 2015-11-20 22.08, Stefan Beller wrote: > The patch looks good at first glance, one minor remark below: >> >> diff --git a/run-command.c b/run-command.c > >> @@ -1071,70 +1089,31 @@ static void pp_output(struct parallel_processes *pp) >> >> static int pp_collect_finished(struct parallel_processes *pp) >> { >> - int i = 0; >> - pid_t pid; >> - int wait_status, code; >> + int i, code; > > code is probably "return code"? > woud "ret_value", "res" or "rc" make that more clear ? > The return value of invoked process, yes. "ret_value", "res" sound too much like our own future return value (by convention of naming its own return value ret_value or similar). Instead of return code, I may settle with `exit_status`? -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCHv2] run-command: detect finished children by closed pipe rather than waitpid
On Mon, Nov 23, 2015 at 10:44 AM, Stefan Bellerwrote: > On Fri, Nov 20, 2015 at 10:58 PM, Torsten Bögershausen wrote: >> On 2015-11-20 22.08, Stefan Beller wrote: >> The patch looks good at first glance, one minor remark below: >>> >>> diff --git a/run-command.c b/run-command.c >> >>> @@ -1071,70 +1089,31 @@ static void pp_output(struct parallel_processes *pp) >>> >>> static int pp_collect_finished(struct parallel_processes *pp) >>> { >>> - int i = 0; >>> - pid_t pid; >>> - int wait_status, code; >>> + int i, code; >> >> code is probably "return code"? >> woud "ret_value", "res" or "rc" make that more clear ? >> > Although looking through the code, we have lots of functions having a local `code` variable, so we may want to preserve consistency across the different functions to have a `code`which contains the return value of the process or function invoked. We had the `code` already in pp_collect_finished, so I'd like to not rename a variable (which was used for the same purpose) in this patch. In pp_start_one we introduce a new variable `code` which contains the return from the user callback function, so I would understand if we were arguing there. That said, I plan to resend with a reworded commit message later today. -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCHv2] run-command: detect finished children by closed pipe rather than waitpid
Detect if a child stopped working by checking if their stderr pipe was closed instead of checking their state with waitpid. As waitpid is not fully working in Windows, this is an approach which allows for better cross platform operation. (It's less code, too) Previously we did not close the read pipe of finished children, which we do now. The old way missed some messages on an early abort. We just killed the children and did not bother to look what was left over. With this approach we'd send a signal to the children and wait for them to close the pipe to have all the messages (including possible "killed by signal 15" messages). To have the test suite passing as before, we allow for real graceful abortion now. In case the user wishes to abort parallel execution the user needs to provide either the signal used to kill all children or the children are let run until they finish normally. Signed-off-by: Stefan Beller--- This applies on top of peff/sb/submodule-parallel-fetch. It is a resend without modification from Nov. 11th. run-command.c | 141 +++-- run-command.h | 12 +++-- submodule.c| 3 -- test-run-command.c | 3 -- 4 files changed, 69 insertions(+), 90 deletions(-) diff --git a/run-command.c b/run-command.c index 07424e9..db4d916 100644 --- a/run-command.c +++ b/run-command.c @@ -858,6 +858,12 @@ int capture_command(struct child_process *cmd, struct strbuf *buf, size_t hint) return finish_command(cmd); } +enum child_state { + GIT_CP_FREE, + GIT_CP_WORKING, + GIT_CP_WAIT_CLEANUP, +}; + static struct parallel_processes { void *data; @@ -869,7 +875,7 @@ static struct parallel_processes { task_finished_fn task_finished; struct { - unsigned in_use : 1; + enum child_state state; struct child_process process; struct strbuf err; void *data; @@ -923,7 +929,7 @@ static void kill_children(struct parallel_processes *pp, int signo) int i, n = pp->max_processes; for (i = 0; i < n; i++) - if (pp->children[i].in_use) + if (pp->children[i].state == GIT_CP_WORKING) kill(pp->children[i].process.pid, signo); } @@ -967,7 +973,7 @@ static struct parallel_processes *pp_init(int n, for (i = 0; i < n; i++) { strbuf_init(>children[i].err, 0); child_process_init(>children[i].process); - pp->pfd[i].events = POLLIN; + pp->pfd[i].events = POLLIN | POLLHUP; pp->pfd[i].fd = -1; } sigchain_push_common(handle_children_on_signal); @@ -1000,39 +1006,46 @@ static void pp_cleanup(struct parallel_processes *pp) * 0 if a new task was started. * 1 if no new jobs was started (get_next_task ran out of work, non critical *problem with starting a new command) - * -1 no new job was started, user wishes to shutdown early. + * <0 no new job was started, user wishes to shutdown early. Use negative code + *to signal the children. */ static int pp_start_one(struct parallel_processes *pp) { - int i; + int i, code; for (i = 0; i < pp->max_processes; i++) - if (!pp->children[i].in_use) + if (pp->children[i].state == GIT_CP_FREE) break; if (i == pp->max_processes) die("BUG: bookkeeping is hard"); - if (!pp->get_next_task(>children[i].data, - >children[i].process, - >children[i].err, - pp->data)) { + code = pp->get_next_task(>children[i].data, +>children[i].process, +>children[i].err, +pp->data); + if (!code) { strbuf_addbuf(>buffered_output, >children[i].err); strbuf_reset(>children[i].err); return 1; } + pp->children[i].process.err = -1; + pp->children[i].process.stdout_to_stderr = 1; + pp->children[i].process.no_stdin = 1; if (start_command(>children[i].process)) { - int code = pp->start_failure(>children[i].process, ->children[i].err, -pp->data, ->children[i].data); + code = pp->start_failure(>children[i].process, +>children[i].err, +pp->data, +>children[i].data); strbuf_addbuf(>buffered_output, >children[i].err); strbuf_reset(>children[i].err); - return code ? -1 : 1; + if (code) + pp->shutdown =
Re: [PATCHv2] run-command: detect finished children by closed pipe rather than waitpid
Stefan Beller wrote: > Detect if a child stopped working by checking if their stderr pipe > was closed instead of checking their state with waitpid. > As waitpid is not fully working in Windows, this is an approach which > allows for better cross platform operation. (It's less code, too) Can you say more about what is broken about waitpid on Windows? I ask because it's possible for a child to close stderr without intending to be finished. That might be okay here (though the commit subject doesn't explain so, it is only affecting the workqueue interface that runs commands in parallel and not the normal run-command interface) but would need some documentation and could be counterintuitive. So if there's a simple way to get waitpid to work, that seems preferrable. Thanks, Jonathan -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCHv2] run-command: detect finished children by closed pipe rather than waitpid
Am 20.11.2015 um 23:05 schrieb Jonathan Nieder: Stefan Beller wrote: Detect if a child stopped working by checking if their stderr pipe was closed instead of checking their state with waitpid. As waitpid is not fully working in Windows, this is an approach which allows for better cross platform operation. (It's less code, too) Can you say more about what is broken about waitpid on Windows? waitpid(-1, ...) is not implemented on Windows. Is it necessary to mention waitpid here at all? The most compelling reason to write the infrastructure in this new way is that it is much more in line with the usual "start_command, read-until-EOF, finish_command" sequence. I ask because it's possible for a child to close stderr without intending to be finished. That might be okay here (though the commit subject doesn't explain so, it is only affecting the workqueue interface that runs commands in parallel and not the normal run-command interface) but would need some documentation and could be counterintuitive. It could be spelled out more clearly. The children have both their stdin and stdout redirected to the same writable end. On the parent side, we have to deal only with "stderr" simply because our child_process facility has everything to redirect like ">&2" (the stdout_to_stderr flag), but nothing for "2>&1". Yeah, it's still possible that the child closes both stdout and stderr long before it dies, but that would really be far, far outside the norm. So if there's a simple way to get waitpid to work, that seems preferrable. It would be possible, but not simple, to make waitpid on Windows suitable for the original code, but that does not make the original code preferrable. -- Hannes -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCHv2] run-command: detect finished children by closed pipe rather than waitpid
On 2015-11-20 22.08, Stefan Beller wrote: The patch looks good at first glance, one minor remark below: > > diff --git a/run-command.c b/run-command.c > @@ -1071,70 +1089,31 @@ static void pp_output(struct parallel_processes *pp) > > static int pp_collect_finished(struct parallel_processes *pp) > { > - int i = 0; > - pid_t pid; > - int wait_status, code; > + int i, code; code is probably "return code"? woud "ret_value", "res" or "rc" make that more clear ? -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html