Hi,
Stefan Beller wrote:
> --- a/Documentation/diff-options.txt
> +++ b/Documentation/diff-options.txt
> @@ -91,14 +91,18 @@ appearing as a deletion or addition in the output. It uses the "patience
> diff" algorithm internally.
>
> --diff-algorithm={patience|minimal|histogram|myers}::
> - Choose a diff algorithm. The variants are as follows:
> + Choose a diff algorithm. See the discussion of DIFF ALGORITHMS
> +ifndef::git-diff[]
> + in linkgit:git-diff[1]
> +endif::git-diff[]
> + . The variants are as follows:
This means outside of git-diff(1), I'd see
See the discussion of DIFF ALGORITHMS in git-diff(1) .
And in git-diff(1), I'd see
See the discussion of DIFF ALGORITHMS .
Neither seems quite right, since each has an extra space before the
period. The git-diff(1) case seems especially off: does it intend to
say something like "See the DIFF ALGORITHMS section for more
discussion"?
[...]
> --- a/Documentation/git-diff.txt
> +++ b/Documentation/git-diff.txt
> @@ -119,6 +119,40 @@ include::diff-options.txt[]
>
> include::diff-format.txt[]
>
> +DIFF ALGORITHMS
> +---------------
Please add some introductory words about what the headings refer to.
> +`Myers`
> +
> +A diff as produced by the basic greedy algorithm described in
> +link:http://www.xmailserver.org/diff2.pdf[An O(ND) Difference Algorithm and its Variations].
> +with a run time of O(M + N + D^2). It employs a heuristic to allow for
> +a faster diff at the small cost of diff size.
> +The `minimal` algorithm has that heuristic turned off.
> +
> +`Patience`
> +
> +This algorithm by Bram Cohen matches the longest common subsequence
> +of unique lines on both sides, recursively. It obtained its name by
> +the way the longest subsequence is found, as that is a byproduct of
> +the patience sorting algorithm. If there are no unique lines left
> +it falls back to `myers`. Empirically this algorithm produces
> +a more readable output for code, but it does not garantuee
nit: s/garantuee/guarantee/
> +the shortest output.
Trivia: the `minimal` variant of Myers doesn't guarantee shortest
output, either: what it minimizes is the number of lines marked as
added or removed. If you want to minimize context lines too, then
that would be a new variant. ;-)
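
To make "number of lines marked as added or removed" concrete, here is
a rough sketch. It is not how git computes anything; the function names
are made up for illustration. It assumes that the smallest possible
line diff keeps a longest common subsequence of lines as context and
marks everything else as removed or added:

```python
def lcs_length(a, b):
    """Length of the longest common subsequence of two lists of lines."""
    prev = [0] * (len(b) + 1)
    for x in a:
        cur = [0]
        for j, y in enumerate(b, 1):
            if x == y:
                cur.append(prev[j - 1] + 1)
            else:
                cur.append(max(prev[j], cur[j - 1]))
        prev = cur
    return prev[-1]


def minimal_change_count(a, b):
    """(removed, added) line counts of a smallest possible line diff."""
    common = lcs_length(a, b)
    return len(a) - common, len(b) - common


old = ["a", "b", "c", "d"]
new = ["a", "c", "d", "e"]
removed, added = minimal_change_count(old, new)
print(removed, added)
```

A diff with more changed lines than that is longer in the sense that
`minimal` cares about, regardless of how many context lines it prints.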
[...]
> +`Histogram`
> +
> +This algorithm finds the longest common substring and recursively
> +diffs the content before and after the longest common substring.
optional: may be worth a short aside in the text about the distinction
between LCS (longest common subsequence) and LCS (longest common
substring). ;-)
It would be especially useful here, since the alphabet used in these
strings is *lines* instead of characters, so the first-time reader
could probably use some help in building their intuition.
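
Something like the following might help with that intuition. It is only
a sketch of the two terms applied to lists of lines, not of how git
implements patience or histogram diff, and the function names and
sample lines are invented for the example:

```python
def longest_common_subsequence(a, b):
    """Longest list of lines appearing in both, in order, gaps allowed."""
    n, m = len(a), len(b)
    # table[i][j] = length of the LCS of a[i:] and b[j:]
    table = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(n - 1, -1, -1):
        for j in range(m - 1, -1, -1):
            if a[i] == b[j]:
                table[i][j] = table[i + 1][j + 1] + 1
            else:
                table[i][j] = max(table[i + 1][j], table[i][j + 1])
    # Walk the table from the top-left to recover one such subsequence.
    out, i, j = [], 0, 0
    while i < n and j < m:
        if a[i] == b[j]:
            out.append(a[i])
            i += 1
            j += 1
        elif table[i + 1][j] >= table[i][j + 1]:
            i += 1
        else:
            j += 1
    return out


def longest_common_substring(a, b):
    """Longest run of adjacent lines appearing in both, no gaps allowed."""
    best_len, best_end = 0, 0
    prev = [0] * (len(b) + 1)
    for i, x in enumerate(a, 1):
        cur = [0] * (len(b) + 1)
        for j, y in enumerate(b, 1):
            if x == y:
                # Extend the common run that ended just before (i, j).
                cur[j] = prev[j - 1] + 1
                if cur[j] > best_len:
                    best_len, best_end = cur[j], i
        prev = cur
    return a[best_end - best_len:best_end]


old = ["int x;", "foo();", "int y;", "bar();", "return;"]
new = ["int x;", "int y;", "baz();", "bar();", "return;"]
print(longest_common_subsequence(old, new))
print(longest_common_substring(old, new))
```

The subsequence may skip over `foo();` and `baz();`, so it has four
lines; the substring has to be contiguous on both sides, so it has only
two (`bar();` and `return;`).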
> +If there are no common substrings left, fallback to `myers`.
nit: fallback is the noun, fall back is the verb.
> +This is often the fastest, but in corner cases (when there are
> +many common substrings of the same length) it produces bad
Can you clarify what "bad" means? E.g. would "unexpected", or "poorly
aligned", match what you mean?
> +results as seen in:
> +
> + seq 1 100 >one
> + echo 99 > two
> + seq 1 2 98 >>two
> + git diff --no-index --histogram one two
> +
> EXAMPLES
> --------
Thanks,
Jonathan