cloud-fan commented on code in PR #56058:
URL: https://github.com/apache/spark/pull/56058#discussion_r3287023406


##########
dev/merge_spark_pr.py:
##########
@@ -506,6 +503,69 @@ def cherry_pick(pr_num, merge_hash, default_branch):
     return pick_ref
 
 
+def cherry_pick(pr_num, merge_hash, default_branch, branch_names, 
already_picked=()):
+    """Prompt for a target branch and cherry-pick `merge_hash` onto it.
+
+    Enforces the Upstream-First policy (see header comment): when the 
committer types a
+    branch-M.N target while branch-M.x is also a known release branch AND has 
not already
+    received this commit, prompt to confirm whether to pick into BOTH (the 
policy-compliant
+    default) or branch-M.N only (treated as a maintenance-only bugfix). 
Returns the list of
+    refs actually picked into, so the main loop can advance its 
remaining-branches list
+    correctly.
+    """
+    pick_ref = bold_input("Enter a branch name [%s]: " % default_branch)
+    if pick_ref == "":
+        pick_ref = default_branch
+
+    # Upstream-First check: if the committer typed branch-M.N and branch-M.x 
exists as a
+    # sibling release branch that has not yet received this commit, surface 
the policy and
+    # offer to pick both in one step. Skipping when sibling_x is the merge 
target or already
+    # in already_picked avoids a redundant prompt / failing re-cherry-pick.
+    sibling_x = None
+    m = re.match(r"^branch-(\d+)\.(\d+)$", pick_ref)

Review Comment:
   The policy gates on the cherry-pick target, but conceptually it should gate 
on the PR base (`target_ref`). The committer can only skip `branch-M.x` when 
the PR was opened against `master`:
   
   - `target_ref == "master"` → policy is meaningful (committer might type 
`branch-M.N` and bypass `branch-M.x`).
   - `target_ref == "branch-M.x"` → moot; the merge itself lands on 
`branch-M.x` so it can't be skipped.
   - `target_ref == "branch-M.N"` → wrong to fire; the author already chose 
per-branch scope by opening a separated backport PR.
   
   The current code happens to suppress the M.x case via `already_picked` 
(because `merged_refs` is pre-seeded with `target_ref`), but that's incidental 
— `already_picked`'s real job is in-loop suppression within a master-based 
merge (committer picks `branch-M.x` first, then `branch-M.N`, don't re-prompt). 
The M.N case isn't covered at all and will produce false-positive prompts on 
cross-`branch-M.N` cherry-picks.
   
   `target_ref` is already in scope at both call sites (set at line 946). 
Suggest plumbing it as a parameter and short-circuiting when it isn't 
`"master"`:
   
   ```python
   def cherry_pick(pr_num, merge_hash, default_branch, branch_names, 
target_ref, already_picked=()):
       ...
       sibling_x = None
       if target_ref == "master":
           m = re.match(r"^branch-(\d+)\.(\d+)$", pick_ref)
           if m:
               candidate = "branch-%s.x" % m.group(1)
               if candidate in branch_names and candidate not in already_picked:
                   sibling_x = candidate
       ...
   ```



##########
dev/merge_spark_pr.py:
##########
@@ -506,6 +503,69 @@ def cherry_pick(pr_num, merge_hash, default_branch):
     return pick_ref
 
 
+def cherry_pick(pr_num, merge_hash, default_branch, branch_names, 
already_picked=()):
+    """Prompt for a target branch and cherry-pick `merge_hash` onto it.
+
+    Enforces the Upstream-First policy (see header comment): when the 
committer types a
+    branch-M.N target while branch-M.x is also a known release branch AND has 
not already
+    received this commit, prompt to confirm whether to pick into BOTH (the 
policy-compliant
+    default) or branch-M.N only (treated as a maintenance-only bugfix). 
Returns the list of
+    refs actually picked into, so the main loop can advance its 
remaining-branches list
+    correctly.
+    """
+    pick_ref = bold_input("Enter a branch name [%s]: " % default_branch)
+    if pick_ref == "":
+        pick_ref = default_branch
+
+    # Upstream-First check: if the committer typed branch-M.N and branch-M.x 
exists as a
+    # sibling release branch that has not yet received this commit, surface 
the policy and
+    # offer to pick both in one step. Skipping when sibling_x is the merge 
target or already
+    # in already_picked avoids a redundant prompt / failing re-cherry-pick.
+    sibling_x = None
+    m = re.match(r"^branch-(\d+)\.(\d+)$", pick_ref)
+    if m:
+        candidate = "branch-%s.x" % m.group(1)
+        if (
+            candidate in branch_names
+            and candidate != pick_ref
+            and candidate not in already_picked
+        ):
+            sibling_x = candidate
+
+    if sibling_x is not None:
+        print()
+        print("=" * 76)

Review Comment:
   Other visual dividers in this file use `"=" * 80` (lines 933, 935, 943, 
945). Use 80 here (and at line 545) for consistency.
   ```suggestion
           print("=" * 80)
   ```



##########
dev/merge_spark_pr.py:
##########
@@ -506,6 +503,69 @@ def cherry_pick(pr_num, merge_hash, default_branch):
     return pick_ref
 
 
+def cherry_pick(pr_num, merge_hash, default_branch, branch_names, 
already_picked=()):
+    """Prompt for a target branch and cherry-pick `merge_hash` onto it.
+
+    Enforces the Upstream-First policy (see header comment): when the 
committer types a
+    branch-M.N target while branch-M.x is also a known release branch AND has 
not already
+    received this commit, prompt to confirm whether to pick into BOTH (the 
policy-compliant
+    default) or branch-M.N only (treated as a maintenance-only bugfix). 
Returns the list of
+    refs actually picked into, so the main loop can advance its 
remaining-branches list
+    correctly.
+    """
+    pick_ref = bold_input("Enter a branch name [%s]: " % default_branch)
+    if pick_ref == "":
+        pick_ref = default_branch
+
+    # Upstream-First check: if the committer typed branch-M.N and branch-M.x 
exists as a
+    # sibling release branch that has not yet received this commit, surface 
the policy and
+    # offer to pick both in one step. Skipping when sibling_x is the merge 
target or already
+    # in already_picked avoids a redundant prompt / failing re-cherry-pick.
+    sibling_x = None
+    m = re.match(r"^branch-(\d+)\.(\d+)$", pick_ref)
+    if m:
+        candidate = "branch-%s.x" % m.group(1)
+        if (
+            candidate in branch_names
+            and candidate != pick_ref
+            and candidate not in already_picked
+        ):
+            sibling_x = candidate
+
+    if sibling_x is not None:
+        print()
+        print("=" * 76)
+        print(
+            "Upstream-First policy: non-bugfix commits on %s should also land 
on %s."
+            % (pick_ref, sibling_x)
+        )
+        print("If this is a %s-only maintenance bugfix, you may pick %s alone."
+              % (pick_ref, pick_ref))
+        print("Otherwise, pick both (%s first, then %s)." % (sibling_x, 
pick_ref))
+        print("=" * 76)
+        choice = (
+            bold_input(
+                "Pick into [b]oth %s + %s / [o]nly %s / [a]bort (default: 
both): "
+                % (sibling_x, pick_ref, pick_ref)
+            )
+            .strip()
+            .lower()
+        )
+        if choice in ("", "b", "both"):
+            picked_x = _do_cherry_pick(pr_num, merge_hash, sibling_x)
+            picked_n = _do_cherry_pick(pr_num, merge_hash, pick_ref)
+            return [picked_x, picked_n]
+        elif choice in ("o", "only"):
+            return [_do_cherry_pick(pr_num, merge_hash, pick_ref)]
+        elif choice in ("a", "abort"):
+            clean_up()
+            fail("Aborted by user at Upstream-First policy prompt.")

Review Comment:
   `fail()` already calls `clean_up()` (line 365), so this produces a double 
"Restoring head pointer to..." print. The `else: fail(...)` branch directly 
below doesn't pre-call cleanup; align them.
   ```suggestion
           elif choice in ("a", "abort"):
               fail("Aborted by user at Upstream-First policy prompt.")
   ```



##########
dev/merge_spark_pr.py:
##########
@@ -506,6 +503,69 @@ def cherry_pick(pr_num, merge_hash, default_branch):
     return pick_ref
 
 
+def cherry_pick(pr_num, merge_hash, default_branch, branch_names, 
already_picked=()):
+    """Prompt for a target branch and cherry-pick `merge_hash` onto it.
+
+    Enforces the Upstream-First policy (see header comment): when the 
committer types a
+    branch-M.N target while branch-M.x is also a known release branch AND has 
not already
+    received this commit, prompt to confirm whether to pick into BOTH (the 
policy-compliant
+    default) or branch-M.N only (treated as a maintenance-only bugfix). 
Returns the list of
+    refs actually picked into, so the main loop can advance its 
remaining-branches list
+    correctly.
+    """
+    pick_ref = bold_input("Enter a branch name [%s]: " % default_branch)
+    if pick_ref == "":
+        pick_ref = default_branch
+
+    # Upstream-First check: if the committer typed branch-M.N and branch-M.x 
exists as a
+    # sibling release branch that has not yet received this commit, surface 
the policy and
+    # offer to pick both in one step. Skipping when sibling_x is the merge 
target or already
+    # in already_picked avoids a redundant prompt / failing re-cherry-pick.
+    sibling_x = None
+    m = re.match(r"^branch-(\d+)\.(\d+)$", pick_ref)
+    if m:
+        candidate = "branch-%s.x" % m.group(1)
+        if (
+            candidate in branch_names
+            and candidate != pick_ref
+            and candidate not in already_picked
+        ):
+            sibling_x = candidate

Review Comment:
   The PR notes no tests cover the new policy logic. The check here is pure (no 
I/O), so it could move into a small helper that's doctest-able without mocking 
`bold_input`:
   
   ```python
   def _upstream_first_sibling(pick_ref, branch_names, already_picked):
       """Return the sibling branch-M.x if Upstream-First should prompt, else 
None.
   
       >>> _upstream_first_sibling("branch-4.2", ["branch-4.x", "branch-4.2"], 
())
       'branch-4.x'
       >>> _upstream_first_sibling("branch-4.2", ["branch-4.x", "branch-4.2"], 
("branch-4.x",))
       >>> _upstream_first_sibling("branch-4.x", ["branch-4.x"], ())
       >>> _upstream_first_sibling("branch-4.99", ["branch-4.2"], ())
       """
       m = re.match(r"^branch-(\d+)\.(\d+)$", pick_ref)
       if not m:
           return None
       candidate = "branch-%s.x" % m.group(1)
       if candidate in branch_names and candidate not in already_picked:
           return candidate
       return None
   ```



##########
dev/merge_spark_pr.py:
##########
@@ -506,6 +503,69 @@ def cherry_pick(pr_num, merge_hash, default_branch):
     return pick_ref
 
 
+def cherry_pick(pr_num, merge_hash, default_branch, branch_names, 
already_picked=()):
+    """Prompt for a target branch and cherry-pick `merge_hash` onto it.
+
+    Enforces the Upstream-First policy (see header comment): when the 
committer types a
+    branch-M.N target while branch-M.x is also a known release branch AND has 
not already
+    received this commit, prompt to confirm whether to pick into BOTH (the 
policy-compliant
+    default) or branch-M.N only (treated as a maintenance-only bugfix). 
Returns the list of
+    refs actually picked into, so the main loop can advance its 
remaining-branches list
+    correctly.
+    """
+    pick_ref = bold_input("Enter a branch name [%s]: " % default_branch)
+    if pick_ref == "":
+        pick_ref = default_branch
+
+    # Upstream-First check: if the committer typed branch-M.N and branch-M.x 
exists as a
+    # sibling release branch that has not yet received this commit, surface 
the policy and
+    # offer to pick both in one step. Skipping when sibling_x is the merge 
target or already
+    # in already_picked avoids a redundant prompt / failing re-cherry-pick.
+    sibling_x = None
+    m = re.match(r"^branch-(\d+)\.(\d+)$", pick_ref)
+    if m:
+        candidate = "branch-%s.x" % m.group(1)
+        if (
+            candidate in branch_names
+            and candidate != pick_ref
+            and candidate not in already_picked
+        ):

Review Comment:
   `candidate != pick_ref` is unreachable as false: the regex on line 525 
constrains `pick_ref` to `branch-M.N` where N is a digit, while `candidate` is 
`branch-M.x` (literal `x`). They can never be equal.
   ```suggestion
           if (
               candidate in branch_names
               and candidate not in already_picked
           ):
   ```



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to