Re: [PATCHv6 0/8] fetch submodules in parallel

2015-10-02 Thread Junio C Hamano
Stefan Beller  writes:

> * renamed return_value_fn to task_finished_fn

It made interdiff noisier but I think it gives us a good end result.

> * the main loop of the parallel processing was first adapted to Junios 
> suggestion,
>   but Jonathan pointed out more improvements.  We can get rid of 
> `no_more_task`
>   completely as `if (!pp->nr_processes)` as the exit condition is sufficient.
>   (pp->nr_processes is modified only when starting or reaping a child, so we 
> will
>   capture the whole output of each subprocess even in case of a quick 
> shutdown)

Interesting.  The original motivation for "no-more-task" check was
that even when we are no longer running anything (i.e. everybody
finished) we may get a new task from next_task(), and the condition
to "break" out of the loop could be placed anywhere in that loop
(e.g. after we wait and cull the finished tasks, or even in the
outermost while(1) condition).

But you can take advantage of the specific placement of the check;
it is after the part that spawns new tasks and before the part that
culls the existing tasks, so not having any running task at that
point is sufficient condition.

Will replace what was queued.

Thanks.
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCHv6 0/8] fetch submodules in parallel

2015-10-01 Thread Ramsay Jones
Hi Stefan,

On 01/10/15 02:54, Stefan Beller wrote:
[snip]

While skimming the interdiff for this series, ...

> diff --git a/run-command.c b/run-command.c
> index df84985..28048a7 100644
> --- a/run-command.c
> +++ b/run-command.c
> @@ -863,12 +863,13 @@ struct parallel_processes {
>  
>   get_next_task_fn get_next_task;
>   start_failure_fn start_failure;
> - return_value_fn return_value;
> + task_finished_fn task_finished;
>  
>   struct {
>   unsigned in_use : 1;
>   struct child_process process;
>   struct strbuf err;
> + void *data;
>   } *children;
>   /*
>* The struct pollfd is logically part of *children,
> @@ -882,9 +883,10 @@ struct parallel_processes {
>   struct strbuf buffered_output; /* of finished children */
>  } parallel_processes_struct;
>  
> -static int default_start_failure(void *data,
> -  struct child_process *cp,
> -  struct strbuf *err)
> +static int default_start_failure(struct child_process *cp,
> +  struct strbuf *err,
> +  void *pp_cb,
> +  void *pp_task_cb)
>  {
>   int i;
>  
> @@ -895,10 +897,11 @@ static int default_start_failure(void *data,
>   return 0;
>  }
>  
> -static int default_return_value(void *data,
> - struct child_process *cp,
> - struct strbuf *err,
> - int result)
> +static int default_task_finished(int result,
> +  struct child_process *cp,
> +  struct strbuf *err,
> +  void *pp_cb,
> +  void *pp_task_cb)
>  {
>   int i;
>  
> @@ -930,10 +933,11 @@ static void handle_children_on_signal(int signo)
>   raise(signo);
>  }
>  
> -static struct parallel_processes *pp_init(int n, void *data,
> +static struct parallel_processes *pp_init(int n,
> get_next_task_fn get_next_task,
> start_failure_fn start_failure,
> -   return_value_fn return_value)
> +   task_finished_fn task_finished,
> +   void *data)
>  {
>   int i;
>   struct parallel_processes *pp = _processes_struct;
> @@ -948,7 +952,7 @@ static struct parallel_processes *pp_init(int n, void 
> *data,
>   pp->get_next_task = get_next_task;
>  
>   pp->start_failure = start_failure ? start_failure : 
> default_start_failure;
> - pp->return_value = return_value ? return_value : default_return_value;
> + pp->task_finished = task_finished ? task_finished : 
> default_task_finished;
>  
>   pp->nr_processes = 0;
>   pp->output_owner = 0;
> @@ -1006,15 +1010,17 @@ static int pp_start_one(struct parallel_processes *pp)
>   if (i == pp->max_processes)
>   die("BUG: bookkeeping is hard");
>  
> - if (!pp->get_next_task(pp->data,
> + if (!pp->get_next_task(>children[i].data,
>  >children[i].process,
> ->children[i].err))
> +>children[i].err,
> +pp->data))
>   return 1;

... the above hunk caught my eye. I don't know that it matters that
much, but since you have reordered parameters on some functions, should
pp->get_next_task() take the 'task_cb' as the last parameter, rather than
the first?

I have not looked at the final result yet (just the interdiff), so please
just ignore the above if I've missed something obvious. :-D

ATB,
Ramsay Jones

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCHv6 0/8] fetch submodules in parallel

2015-10-01 Thread Stefan Beller
On Thu, Oct 1, 2015 at 11:55 AM, Ramsay Jones
 wrote:
> Hi Stefan,
>
>>
>> - if (!pp->get_next_task(pp->data,
>> + if (!pp->get_next_task(>children[i].data,
>>  >children[i].process,
>> ->children[i].err))
>> +>children[i].err,
>> +pp->data))
>>   return 1;
>
> ... the above hunk caught my eye. I don't know that it matters that
> much, but since you have reordered parameters on some functions, should
> pp->get_next_task() take the 'task_cb' as the last parameter, rather than
> the first?
>
> I have not looked at the final result yet (just the interdiff), so please
> just ignore the above if I've missed something obvious. :-D

Well I reordered such that "passive" arguments come last, i.e. the
cookies for consumption. In this specific case we ask get_next_task
to fill in the cookie if desired. Unlike all the other cookie passing
this is a double void pointer, so even syntactically we have a difference
to other cookie passing around.

If you look at the function definitions in the header, you find the arguments
ordered as

  (Active/unique arguments for that function, child process, error
buffer, cookies for consumption)

That said, I find a few things I need to improve.
pp->children[i].data, may want to be initialized to NULL before we ask
get_next_task
to fill in a cookie. If get_next_task decides not to, we have a clear default.

The call to run_processes_parallel may be reordered to its original order again,
as we pass in a cookie actively. (int n, int *cb, callbacks...)

>
> ATB,
> Ramsay Jones
>
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCHv6 0/8] fetch submodules in parallel

2015-09-30 Thread Stefan Beller
This replaces sb/submodule-parallel-fetch once again.
Changes are only in patch 5,6,7
(5: reverse popping, 6: see below, 7: adapt to changes of 6).

Junio wrote:
> > + if (pp->return_value(pp->data, >children[i].process,
> > +  >children[i].err, code))
> at this point, code can be uninitialized if we took the last "is
> confused" arm of the if/elseif cascade.

It's fixed in the reroll.

sigchain_pop_common reversed the popping.

When I started an office discussion with Jonathan about how to best implement
the next step ("git submodule update" using the parallel processing machine),
I fixes some nits and also some major spots:

* The order of the arguments for the callbacks (Generally the callback cookie
  comes last and is called `cb` and not `data`)
  
* renamed return_value_fn to task_finished_fn
  
* Add another callback cookie for task specific things. This will help in the
  rewrite of `git submodule update` as there are steps to be done after the
  some processes are done using the parallel engine. So we want to be able
  to remember specific children or tag information on them instead parsing the
  cp->argv.

* the main loop of the parallel processing was first adapted to Junios 
suggestion,
  but Jonathan pointed out more improvements.  We can get rid of `no_more_task`
  completely as `if (!pp->nr_processes)` as the exit condition is sufficient.
  (pp->nr_processes is modified only when starting or reaping a child, so we 
will
  capture the whole output of each subprocess even in case of a quick shutdown)

* even more accurate documentation

Jonathan Nieder (1):
  submodule.c: write "Fetching submodule " to stderr

Stefan Beller (7):
  xread: poll on non blocking fds
  xread_nonblock: add functionality to read from fds without blocking
  strbuf: add strbuf_read_once to read without blocking
  sigchain: add command to pop all common signals
  run-command: add an asynchronous parallel child processor
  fetch_populated_submodules: use new parallel job processing
  submodules: allow parallel fetching, add tests and documentation

 Documentation/fetch-options.txt |   7 +
 builtin/fetch.c |   6 +-
 builtin/pull.c  |   6 +
 git-compat-util.h   |   1 +
 run-command.c   | 350 
 run-command.h   |  78 +
 sigchain.c  |   9 ++
 sigchain.h  |   1 +
 strbuf.c|  11 ++
 strbuf.h|   9 ++
 submodule.c | 129 +++
 submodule.h |   2 +-
 t/t0061-run-command.sh  |  20 +++
 t/t5526-fetch-submodules.sh |  70 +---
 test-run-command.c  |  25 +++
 wrapper.c   |  35 +++-
 16 files changed, 695 insertions(+), 64 deletions(-)
diff --git a/run-command.c b/run-command.c
index df84985..28048a7 100644
--- a/run-command.c
+++ b/run-command.c
@@ -863,12 +863,13 @@ struct parallel_processes {
 
get_next_task_fn get_next_task;
start_failure_fn start_failure;
-   return_value_fn return_value;
+   task_finished_fn task_finished;
 
struct {
unsigned in_use : 1;
struct child_process process;
struct strbuf err;
+   void *data;
} *children;
/*
 * The struct pollfd is logically part of *children,
@@ -882,9 +883,10 @@ struct parallel_processes {
struct strbuf buffered_output; /* of finished children */
 } parallel_processes_struct;
 
-static int default_start_failure(void *data,
-struct child_process *cp,
-struct strbuf *err)
+static int default_start_failure(struct child_process *cp,
+struct strbuf *err,
+void *pp_cb,
+void *pp_task_cb)
 {
int i;
 
@@ -895,10 +897,11 @@ static int default_start_failure(void *data,
return 0;
 }
 
-static int default_return_value(void *data,
-   struct child_process *cp,
-   struct strbuf *err,
-   int result)
+static int default_task_finished(int result,
+struct child_process *cp,
+struct strbuf *err,
+void *pp_cb,
+void *pp_task_cb)
 {
int i;
 
@@ -930,10 +933,11 @@ static void handle_children_on_signal(int signo)
raise(signo);
 }
 
-static struct parallel_processes *pp_init(int n, void *data,
+static struct parallel_processes *pp_init(int n,
  get_next_task_fn get_next_task,
  start_failure_fn start_failure,
-