Since the KEYWORDS=... assignment is a single line, git struggles to
handle conflicts in it. When rebasing a series of commits that modify
the KEYWORDS=... assignment, it's usually easier to throw them away and
reapply them on the new tree than to resolve the conflicts manually
during the rebase.

git allows a 'merge driver' program to handle conflicts; this program
handles conflicts in the KEYWORDS=... assignment. E.g., given an ebuild
with these keywords:

KEYWORDS="~alpha ~amd64 ~arm ~arm64 ~hppa ~ia64 ~mips ~ppc ~ppc64 ~riscv ~s390 ~sparc ~x86"

One developer drops the ~alpha keyword and pushes to gentoo.git, and
another developer stabilizes hppa. Without this merge driver, git
requires the second developer to resolve the conflict manually. With
the merge driver, git resolves the conflict automatically.
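
As a rough illustration of the idea (a simplified sketch, not the driver
itself), the change between the original and the other branch's keyword
list can be computed with difflib and expressed as ekeyword-style
arguments, which are then applied to our version. The helper names
keyword_list and ekeyword_args are made up for this example.

```python
import difflib


def keyword_list(line):
    """Return the keywords between the double quotes of a KEYWORDS= line."""
    return line[line.find('"') + 1:line.rfind('"')].split()


def ekeyword_args(old_line, new_line):
    """Translate the old -> new keyword diff into ekeyword-style arguments."""
    old, new = keyword_list(old_line), keyword_list(new_line)
    args = []
    matcher = difflib.SequenceMatcher(a=old, b=new)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag in ('replace', 'delete'):
            # '^arch' asks ekeyword to remove the keyword entirely
            args.extend('^' + kw.lstrip('~-') for kw in old[i1:i2])
        if tag in ('replace', 'insert'):
            args.extend(new[j1:j2])
    return args


base = ('KEYWORDS="~alpha ~amd64 ~arm ~arm64 ~hppa ~ia64 ~mips '
        '~ppc ~ppc64 ~riscv ~s390 ~sparc ~x86"')
stabilized = base.replace('~hppa', 'hppa')   # second developer's change
dropped = base.replace('~alpha ', '')        # first developer's change

print(ekeyword_args(base, stabilized))  # ['^hppa', 'hppa']
print(ekeyword_args(base, dropped))     # ['^alpha']
```

Applying the first argument list to the tree that already dropped ~alpha
yields a KEYWORDS line with both changes.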

gentoo.git/.git/config:

        [core]
                ...
                attributesfile = ~/.gitattributes
        [merge "keywords"]
                name = KEYWORDS merge driver
                driver = merge-driver-ekeyword %O %A %B

~/.gitattributes:

        *.ebuild merge=keywords

Signed-off-by: Matt Turner <matts...@gentoo.org>
---
One annoying wart in the program: ekeyword won't work on any file not
named *.ebuild. To work around this, I create a symlink (and register
an atexit handler to remove it). I'm not sure we could make ekeyword
handle arbitrary filenames, given its complex multi-argument parameter
support. git merge files are named .merge_file_XXXXX according to
git-unpack-file(1), so we could allow those. Thoughts?
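
If allowing those names turns out to be acceptable, the check could be
as small as a filename predicate. The following is only a sketch under
assumptions: is_editable and MERGE_FILE_RE are hypothetical names that
do not exist in ekeyword, and the pattern assumes the random suffix
consists of word characters.

```python
import os
import re

# Assumption: git's temporary merge files have basenames of the form
# ".merge_file_" followed by a random suffix, per git-unpack-file(1).
MERGE_FILE_RE = re.compile(r'\.merge_file_\w+')


def is_editable(path):
    """Hypothetical check: accept *.ebuild and git temporary merge files."""
    name = os.path.basename(path)
    return name.endswith('.ebuild') or MERGE_FILE_RE.fullmatch(name) is not None
```

With something like this in place, the symlink and the atexit handler in
the driver would no longer be needed.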

 bin/merge-driver-ekeyword | 125 ++++++++++++++++++++++++++++++++++++++
 1 file changed, 125 insertions(+)
 create mode 100755 bin/merge-driver-ekeyword

diff --git a/bin/merge-driver-ekeyword b/bin/merge-driver-ekeyword
new file mode 100755
index 0000000..6e645a9
--- /dev/null
+++ b/bin/merge-driver-ekeyword
@@ -0,0 +1,125 @@
+#!/usr/bin/python
+#
+# Copyright 2020 Gentoo Authors
+# Distributed under the terms of the GNU General Public License v2 or later
+
+"""
+Custom git merge driver for handling conflicts in KEYWORDS assignments
+
+See https://git-scm.com/docs/gitattributes#_defining_a_custom_merge_driver
+"""
+
+import atexit
+import difflib
+import os
+import shutil
+import sys
+
+from typing import List, Optional, Tuple
+
+from gentoolkit.ekeyword import ekeyword
+
+
+def keyword_array(keyword_line: str) -> List[str]:
+    # Find indices of string inside the double-quotes
+    i1: int = keyword_line.find('"') + 1
+    i2: int = keyword_line.rfind('"')
+
+    # Split into array of KEYWORDS
+    return keyword_line[i1:i2].split(' ')
+
+
+def keyword_line_changes(old: str, new: str) -> List[Tuple[Optional[str],
+                                                           Optional[str]]]:
+    a: List[str] = keyword_array(old)
+    b: List[str] = keyword_array(new)
+
+    s = difflib.SequenceMatcher(a=a, b=b)
+
+    changes = []
+    for tag, i1, i2, j1, j2 in s.get_opcodes():
+        if tag == 'replace':
+            changes.append((a[i1:i2], b[j1:j2]),)
+        elif tag == 'delete':
+            changes.append((a[i1:i2], None),)
+        elif tag == 'insert':
+            changes.append((None, b[j1:j2]),)
+        else:
+            assert tag == 'equal'
+    return changes
+
+
+def keyword_changes(ebuild1: str, ebuild2: str) -> List[Tuple[Optional[str],
+                                                              Optional[str]]]:
+    with open(ebuild1) as e1, open(ebuild2) as e2:
+        lines1 = e1.readlines()
+        lines2 = e2.readlines()
+
+        diff = difflib.unified_diff(lines1, lines2, n=0)
+        assert next(diff) == '--- \n'
+        assert next(diff) == '+++ \n'
+
+        hunk: int = 0
+        old: str = ''
+        new: str = ''
+
+        for line in diff:
+            if line.startswith('@@ '):
+                if hunk > 0: break
+                hunk += 1
+            elif line.startswith('-'):
+                if old or new: break
+                old = line
+            elif line.startswith('+'):
+                if not old or new: break
+                new = line
+        else:
+            if 'KEYWORDS=' in old and 'KEYWORDS=' in new:
+                return keyword_line_changes(old, new)
+        return None
+
+
+def apply_keyword_changes(ebuild: str,
+                          changes: List[Tuple[Optional[str],
+                                              Optional[str]]]) -> int:
+    # ekeyword will only modify files named *.ebuild, so make a symlink
+    ebuild_symlink = ebuild + '.ebuild'
+    os.symlink(ebuild, ebuild_symlink)
+    atexit.register(lambda: os.remove(ebuild_symlink))
+
+    for removals, additions in changes:
+        args = []
+        for rem in removals or []:
+            # Drop leading '~' and '-' characters and prepend '^'
+            i = 1 if rem[0] in ('~', '-') else 0
+            args.append('^' + rem[i:])
+        if additions:
+            args.extend(additions)
+        args.append(ebuild_symlink)
+
+        result = ekeyword.main(args)
+        if result != 0:
+            return result
+    return 0
+
+
+def main(argv):
+    if len(argv) != 4:
+        sys.exit(-1)
+
+    O = argv[1] # %O - filename of original
+    A = argv[2] # %A - filename of our current version
+    B = argv[3] # %B - filename of the other branch's version
+
+    # Get changes from %O to %B
+    changes = keyword_changes(O, B)
+    if not changes:
+        sys.exit(-1)
+
+    # Apply O -> B changes to A
+    result: int = apply_keyword_changes(A, changes)
+    sys.exit(result)
+
+
+if __name__ == "__main__":
+    main(sys.argv)
-- 
2.26.2

