Re: [PATCH 03/10] run-command: add an asynchronous parallel child processor

Junio C Hamano Thu, 17 Sep 2015 18:06:58 -0700

Stefan Beller <sbel...@google.com> writes:

> On Thu, Sep 17, 2015 at 2:44 PM, Junio C Hamano <gits...@pobox.com> wrote:
>
>> Hmm, you are relying on the fact that a valid pid can never be 0, so
>> you can just use pp->children[i].child.pid to see if a "slot" is
>> occupied without even using pp->slots[] (or pp->children[i].in_use).
>
> We could either use the pid as accessed via children[i].process.pid
> or we could rely on the children[i].process.err != -1 as start_process
> will set it to an actual fd, and when it's done we reset it to -1.
>
> I am unsure if this make it less readable though (all your other
> suggestions improve readability a lot!)


Sorry, the above was not a suggestion but merely an observation with
a bit of thinking aloud mixed in.  I should have marked it as such
more clearly.

>> ... and then picks a new owner of the output channel.
>>
>> Up to this point it looks sensible.
>>
>>> +                     fputs(pp->err[pp->foreground_child].buf, stderr);
>>> +                     strbuf_reset(&pp->err[pp->foreground_child]);
>>
>> I do not think these two lines need to be here, especially if you
>> follow the above advice of separating buffering and draining.
>
> They are outputting the buffer, if the next selected child is still running.
> I mean it would output eventually anyway, but it would only output after
> the next start of processes and poll() is done. Yeah maybe that's what we
> want (starting new processes earlier is better than outputting earlier as 
> we're
> talking microseconds here).

I am not talking about microseconds but refraining from doing the
same thing in multiple places.

>> But calling start_new() unconditionally feels sloppy.  It should at
>> least be something like
>>
>>         if (pp.nr_processes < pp.max_processes &&
>>             !pp.all_task_started)
>>                 start_new_process()
>>
>> no?
>
> That's the exact condition we have in start_new_process
> so I don't want to repeat it here?

You are advocating to have a function whose definition is "this may
or may not start a new process and the caller should not care", and
then make the caller keep calling it, knowing/hoping that the caller
does not care if we spawn a new thing or not.

I somehow find it a highly questionable design taste to base the
decision on "don't want to repeat it here".

Stepping back and thinking about it, what is suggested is more
explicit in the caller.  "If we know we need a new worker and we can
have a new worker, then start it." is the logic in the caller in the
suggested structure, and we would have a well defined helper whose
sole task is to start a new worker to be called from the caller.
The helper may want to check if the request to start a new one makes
sense (e.g. if slots[] are all full, it may even want to return an
error), but the checks are primarily for error checking (i.e. "we
can have max N processes, so make sure we do not exceed that limit"),
not a semantic one (i.e. the caller _could_ choose to spawn less
than that max when there is a good reason to do so).


--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH 03/10] run-command: add an asynchronous parallel child processor

Reply via email to