Paul Tan <pyoka...@gmail.com> writes:

> ..., I propose the following requirements for the rewritten code:
>
> 1. No spawning of external git processes. This is to support systems with high
>    ``fork()`` or process creation overhead, and to reduce redundant IO by
>    taking advantage of the internal object, index and configuration cache.

I suspect this is too strict in practice.

True, we should never say "run_command_capture()" just to read
from "git rev-parse"---we should just call get_sha1() instead.

But for a complex command whose execution itself far outweighs the
cost of forking, I do not think it is fair to say your project
failed if you chose to run_command() it.  For example, it may be
perfectly OK to invoke "git merge" via run_command().

> 3. The resulting builtin should not have wildly different behavior or bugs
>    compared to the shell script.

This on the other hand is way too loose.

The original and the port must behave identically, unless the
difference is fixing bugs in the original.

> Potential difficulties
> =======================
>
> Rewriting code may introduce bugs
> ...

Yes, but that is a reasonable risk you need to manage to gain the
benefit from this project.

> Of course, the downside of following this too strictly is that if there were
> any logical bugs in the original code, or if the original code is unclear, the
> rewritten code would inherit these problems too.

I'd repeat my comment on point 3 above.  Identifying and fixing bugs
is great, but otherwise don't worry about this too much.

Being bug-to-bug compatible with the original is way better than
introducing new bugs of an unknown nature.

> Rewritten code may become harder to understand
> ...

And also it may become harder to modify.

That is the largest problem with any rewrite, and we should spend
the most effort to avoid it.

New bugs we introduce can be fixed later, as long as the result is
understandable and maintainable.

> For the purpose of reducing git's dependencies, the rewritten C code should not
> depend on other libraries or executables other than what is already available
> to git builtins.

Perhaps misphrased; see below.

> We can see that the C version requires much more lines compared to the shell
> pipeline,...

That is something you would solve by introducing reusable code in
the run_command API, isn't it?  That is how various rewrites in the
past did it, and this project should do so too.  You should aim to do
this project not just by using "what is already available", but by
adding what you discover to be a useful reusable pattern as a set of
new functions in the "already available" API set.
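
To illustrate the kind of reusable pattern meant here, below is a rough
sketch of a "spawn a command and capture its output" helper.  It is plain
POSIX and is not git's actual run_command API; the name capture_output()
and its signature are made up for illustration only.  The point is that
once such a helper exists, each caller shrinks to a few lines, much like
the shell pipeline it replaces.

```c
#include <errno.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper (an assumption, not part of git): run the
 * NULL-terminated argv, collect the child's standard output into a
 * malloc'd NUL-terminated string stored in *out, and return the
 * child's exit status, or -1 on failure.  The caller frees *out.
 */
static int capture_output(const char *const *argv, char **out)
{
	int fds[2], status, failed = 0;
	size_t len = 0, alloc = 256;
	ssize_t n;
	pid_t pid;
	char *buf = malloc(alloc);

	*out = NULL;
	if (!buf)
		return -1;
	if (pipe(fds) < 0) {
		free(buf);
		return -1;
	}

	pid = fork();
	if (pid < 0) {
		close(fds[0]);
		close(fds[1]);
		free(buf);
		return -1;
	}
	if (!pid) {
		/* child: point stdout at the pipe, then exec the command */
		close(fds[0]);
		if (fds[1] != STDOUT_FILENO) {
			if (dup2(fds[1], STDOUT_FILENO) < 0)
				_exit(127);
			close(fds[1]);
		}
		execvp(argv[0], (char *const *)argv);
		_exit(127); /* exec failed */
	}

	/* parent: read until EOF, doubling the buffer when it fills up */
	close(fds[1]);
	for (;;) {
		if (len + 1 == alloc) {
			char *tmp = realloc(buf, alloc * 2);
			if (!tmp) {
				failed = 1;
				break;
			}
			buf = tmp;
			alloc *= 2;
		}
		n = read(fds[0], buf + len, alloc - len - 1);
		if (n < 0 && errno == EINTR)
			continue;
		if (n < 0)
			failed = 1;
		if (n <= 0)
			break;
		len += n;
	}
	close(fds[0]);
	buf[len] = '\0';

	/* always reap the child, even if reading failed */
	while (waitpid(pid, &status, 0) < 0) {
		if (errno != EINTR) {
			failed = 1;
			break;
		}
	}
	if (failed || !WIFEXITED(status)) {
		free(buf);
		return -1;
	}
	*out = buf;
	return WEXITSTATUS(status);
}
```

A caller would then say something like capture_output(argv, &out) with
argv = { "git", "rev-parse", "HEAD", NULL } and free(out) afterwards.
For that particular case an in-process call is still the better choice,
as said above; a helper like this earns its keep only for the commands
that are genuinely worth spawning.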