Add some documentation for the logic behind the conflict normalization
in rerere.  Also describe a bug that happens because we just linearly
scan for conflict markers.

Signed-off-by: Thomas Gummerer <t.gumme...@gmail.com>
---

This documents my understanding of the rerere conflict normalization
and conflict ID computation logic.  Writing this down helped me
understand the logic, and I thought it may be useful to have this as
documentation in Documentation/technical as well.  Junio: as you wrote
the original NEEDSWORK comment, did you have something more in mind
here that should be documented?

 Documentation/technical/rerere.txt | 43 ++++++++++++++++++++++++++++++
 rerere.c                           |  4 ---
 2 files changed, 43 insertions(+), 4 deletions(-)
 create mode 100644 Documentation/technical/rerere.txt

diff --git a/Documentation/technical/rerere.txt 
b/Documentation/technical/rerere.txt
new file mode 100644
index 0000000000..94cc6a7ef0
--- /dev/null
+++ b/Documentation/technical/rerere.txt
@@ -0,0 +1,43 @@
+Rerere
+======
+
+This document describes the rerere logic.
+
+Conflict normalization
+----------------------
+
+To try and re-do a conflict resolution, even when different merge
+strategies are used, 'rerere' computes a conflict ID for each
+conflict in the file.
+
+This is done by discarding the common ancestor version in the
+diff3-style, and re-ordering the two sides of the conflict, in
+alphabetic order.
+
+Using this technique a conflict that looks as follows when for example
+'master' was merged into a topic branch:
+
+    <<<<<<< HEAD
+    foo
+    =======
+    bar
+    >>>>>>> master
+
+and the opposite way when the topic branch is merged into 'master':
+
+    <<<<<<< HEAD
+    bar
+    =======
+    foo
+    >>>>>>> topic
+
+can be recognized as the same conflict, and can automatically be
+re-resolved by 'rerere', as the SHA-1 sum of the two conflicts would
+be calculated from 'bar<NUL>foo<NUL>' in both cases.
+
+If there are multiple conflicts in one file, they are all appended to
+one another, both in the 'preimage' file as well as in the conflict
+ID.
+
+This is currently implemented by simply scanning through the file and
+looking for conflict markers.
diff --git a/rerere.c b/rerere.c
index af5e6179a9..a02a38e072 100644
--- a/rerere.c
+++ b/rerere.c
@@ -394,10 +394,6 @@ static int is_cmarker(char *buf, int marker_char, int 
marker_size)
  * and NUL concatenated together.
  *
  * Return the number of conflict hunks found.
- *
- * NEEDSWORK: the logic and theory of operation behind this conflict
- * normalization may deserve to be documented somewhere, perhaps in
- * Documentation/technical/rerere.txt.
  */
 static int handle_path(unsigned char *sha1, struct rerere_io *io, int 
marker_size)
 {
-- 
2.17.0.588.g4d217cdf8e.dirty

Reply via email to