cloud-fan commented on code in PR #56058:
URL: https://github.com/apache/spark/pull/56058#discussion_r3287023406
##########
dev/merge_spark_pr.py:
##########
@@ -506,6 +503,69 @@ def cherry_pick(pr_num, merge_hash, default_branch):
return pick_ref
+def cherry_pick(pr_num, merge_hash, default_branch, branch_names,
already_picked=()):
+ """Prompt for a target branch and cherry-pick `merge_hash` onto it.
+
+ Enforces the Upstream-First policy (see header comment): when the
committer types a
+ branch-M.N target while branch-M.x is also a known release branch AND has
not already
+ received this commit, prompt to confirm whether to pick into BOTH (the
policy-compliant
+ default) or branch-M.N only (treated as a maintenance-only bugfix).
Returns the list of
+ refs actually picked into, so the main loop can advance its
remaining-branches list
+ correctly.
+ """
+ pick_ref = bold_input("Enter a branch name [%s]: " % default_branch)
+ if pick_ref == "":
+ pick_ref = default_branch
+
+ # Upstream-First check: if the committer typed branch-M.N and branch-M.x
exists as a
+ # sibling release branch that has not yet received this commit, surface
the policy and
+ # offer to pick both in one step. Skipping when sibling_x is the merge
target or already
+ # in already_picked avoids a redundant prompt / failing re-cherry-pick.
+ sibling_x = None
+ m = re.match(r"^branch-(\d+)\.(\d+)$", pick_ref)
Review Comment:
The policy gates on the cherry-pick target, but conceptually it should gate
on the PR base (`target_ref`). The committer can only skip `branch-M.x` when
the PR was opened against `master`:
- `target_ref == "master"` → policy is meaningful (committer might type
`branch-M.N` and bypass `branch-M.x`).
- `target_ref == "branch-M.x"` → moot; the merge itself lands on
`branch-M.x` so it can't be skipped.
- `target_ref == "branch-M.N"` → wrong to fire; the author already chose
per-branch scope by opening a separated backport PR.
The current code happens to suppress the M.x case via `already_picked`
(because `merged_refs` is pre-seeded with `target_ref`), but that's incidental
— `already_picked`'s real job is in-loop suppression within a master-based
merge (committer picks `branch-M.x` first, then `branch-M.N`, don't re-prompt).
The M.N case isn't covered at all and will produce false-positive prompts on
cross-`branch-M.N` cherry-picks.
`target_ref` is already in scope at both call sites (set at line 946).
Suggest plumbing it as a parameter and short-circuiting when it isn't
`"master"`:
```python
def cherry_pick(pr_num, merge_hash, default_branch, branch_names,
target_ref, already_picked=()):
...
sibling_x = None
if target_ref == "master":
m = re.match(r"^branch-(\d+)\.(\d+)$", pick_ref)
if m:
candidate = "branch-%s.x" % m.group(1)
if candidate in branch_names and candidate not in already_picked:
sibling_x = candidate
...
```
##########
dev/merge_spark_pr.py:
##########
@@ -506,6 +503,69 @@ def cherry_pick(pr_num, merge_hash, default_branch):
return pick_ref
+def cherry_pick(pr_num, merge_hash, default_branch, branch_names,
already_picked=()):
+ """Prompt for a target branch and cherry-pick `merge_hash` onto it.
+
+ Enforces the Upstream-First policy (see header comment): when the
committer types a
+ branch-M.N target while branch-M.x is also a known release branch AND has
not already
+ received this commit, prompt to confirm whether to pick into BOTH (the
policy-compliant
+ default) or branch-M.N only (treated as a maintenance-only bugfix).
Returns the list of
+ refs actually picked into, so the main loop can advance its
remaining-branches list
+ correctly.
+ """
+ pick_ref = bold_input("Enter a branch name [%s]: " % default_branch)
+ if pick_ref == "":
+ pick_ref = default_branch
+
+ # Upstream-First check: if the committer typed branch-M.N and branch-M.x
exists as a
+ # sibling release branch that has not yet received this commit, surface
the policy and
+ # offer to pick both in one step. Skipping when sibling_x is the merge
target or already
+ # in already_picked avoids a redundant prompt / failing re-cherry-pick.
+ sibling_x = None
+ m = re.match(r"^branch-(\d+)\.(\d+)$", pick_ref)
+ if m:
+ candidate = "branch-%s.x" % m.group(1)
+ if (
+ candidate in branch_names
+ and candidate != pick_ref
+ and candidate not in already_picked
+ ):
+ sibling_x = candidate
+
+ if sibling_x is not None:
+ print()
+ print("=" * 76)
Review Comment:
Other visual dividers in this file use `"=" * 80` (lines 933, 935, 943,
945). Use 80 here (and at line 545) for consistency.
```suggestion
print("=" * 80)
```
##########
dev/merge_spark_pr.py:
##########
@@ -506,6 +503,69 @@ def cherry_pick(pr_num, merge_hash, default_branch):
return pick_ref
+def cherry_pick(pr_num, merge_hash, default_branch, branch_names,
already_picked=()):
+ """Prompt for a target branch and cherry-pick `merge_hash` onto it.
+
+ Enforces the Upstream-First policy (see header comment): when the
committer types a
+ branch-M.N target while branch-M.x is also a known release branch AND has
not already
+ received this commit, prompt to confirm whether to pick into BOTH (the
policy-compliant
+ default) or branch-M.N only (treated as a maintenance-only bugfix).
Returns the list of
+ refs actually picked into, so the main loop can advance its
remaining-branches list
+ correctly.
+ """
+ pick_ref = bold_input("Enter a branch name [%s]: " % default_branch)
+ if pick_ref == "":
+ pick_ref = default_branch
+
+ # Upstream-First check: if the committer typed branch-M.N and branch-M.x
exists as a
+ # sibling release branch that has not yet received this commit, surface
the policy and
+ # offer to pick both in one step. Skipping when sibling_x is the merge
target or already
+ # in already_picked avoids a redundant prompt / failing re-cherry-pick.
+ sibling_x = None
+ m = re.match(r"^branch-(\d+)\.(\d+)$", pick_ref)
+ if m:
+ candidate = "branch-%s.x" % m.group(1)
+ if (
+ candidate in branch_names
+ and candidate != pick_ref
+ and candidate not in already_picked
+ ):
+ sibling_x = candidate
+
+ if sibling_x is not None:
+ print()
+ print("=" * 76)
+ print(
+ "Upstream-First policy: non-bugfix commits on %s should also land
on %s."
+ % (pick_ref, sibling_x)
+ )
+ print("If this is a %s-only maintenance bugfix, you may pick %s alone."
+ % (pick_ref, pick_ref))
+ print("Otherwise, pick both (%s first, then %s)." % (sibling_x,
pick_ref))
+ print("=" * 76)
+ choice = (
+ bold_input(
+ "Pick into [b]oth %s + %s / [o]nly %s / [a]bort (default:
both): "
+ % (sibling_x, pick_ref, pick_ref)
+ )
+ .strip()
+ .lower()
+ )
+ if choice in ("", "b", "both"):
+ picked_x = _do_cherry_pick(pr_num, merge_hash, sibling_x)
+ picked_n = _do_cherry_pick(pr_num, merge_hash, pick_ref)
+ return [picked_x, picked_n]
+ elif choice in ("o", "only"):
+ return [_do_cherry_pick(pr_num, merge_hash, pick_ref)]
+ elif choice in ("a", "abort"):
+ clean_up()
+ fail("Aborted by user at Upstream-First policy prompt.")
Review Comment:
`fail()` already calls `clean_up()` (line 365), so this produces a double
"Restoring head pointer to..." print. The `else: fail(...)` branch directly
below doesn't pre-call cleanup; align them.
```suggestion
elif choice in ("a", "abort"):
fail("Aborted by user at Upstream-First policy prompt.")
```
##########
dev/merge_spark_pr.py:
##########
@@ -506,6 +503,69 @@ def cherry_pick(pr_num, merge_hash, default_branch):
return pick_ref
+def cherry_pick(pr_num, merge_hash, default_branch, branch_names,
already_picked=()):
+ """Prompt for a target branch and cherry-pick `merge_hash` onto it.
+
+ Enforces the Upstream-First policy (see header comment): when the
committer types a
+ branch-M.N target while branch-M.x is also a known release branch AND has
not already
+ received this commit, prompt to confirm whether to pick into BOTH (the
policy-compliant
+ default) or branch-M.N only (treated as a maintenance-only bugfix).
Returns the list of
+ refs actually picked into, so the main loop can advance its
remaining-branches list
+ correctly.
+ """
+ pick_ref = bold_input("Enter a branch name [%s]: " % default_branch)
+ if pick_ref == "":
+ pick_ref = default_branch
+
+ # Upstream-First check: if the committer typed branch-M.N and branch-M.x
exists as a
+ # sibling release branch that has not yet received this commit, surface
the policy and
+ # offer to pick both in one step. Skipping when sibling_x is the merge
target or already
+ # in already_picked avoids a redundant prompt / failing re-cherry-pick.
+ sibling_x = None
+ m = re.match(r"^branch-(\d+)\.(\d+)$", pick_ref)
+ if m:
+ candidate = "branch-%s.x" % m.group(1)
+ if (
+ candidate in branch_names
+ and candidate != pick_ref
+ and candidate not in already_picked
+ ):
+ sibling_x = candidate
Review Comment:
The PR notes no tests cover the new policy logic. The check here is pure (no
I/O), so it could move into a small helper that's doctest-able without mocking
`bold_input`:
```python
def _upstream_first_sibling(pick_ref, branch_names, already_picked):
"""Return the sibling branch-M.x if Upstream-First should prompt, else
None.
>>> _upstream_first_sibling("branch-4.2", ["branch-4.x", "branch-4.2"],
())
'branch-4.x'
>>> _upstream_first_sibling("branch-4.2", ["branch-4.x", "branch-4.2"],
("branch-4.x",))
>>> _upstream_first_sibling("branch-4.x", ["branch-4.x"], ())
>>> _upstream_first_sibling("branch-4.99", ["branch-4.2"], ())
"""
m = re.match(r"^branch-(\d+)\.(\d+)$", pick_ref)
if not m:
return None
candidate = "branch-%s.x" % m.group(1)
if candidate in branch_names and candidate not in already_picked:
return candidate
return None
```
##########
dev/merge_spark_pr.py:
##########
@@ -506,6 +503,69 @@ def cherry_pick(pr_num, merge_hash, default_branch):
return pick_ref
+def cherry_pick(pr_num, merge_hash, default_branch, branch_names,
already_picked=()):
+ """Prompt for a target branch and cherry-pick `merge_hash` onto it.
+
+ Enforces the Upstream-First policy (see header comment): when the
committer types a
+ branch-M.N target while branch-M.x is also a known release branch AND has
not already
+ received this commit, prompt to confirm whether to pick into BOTH (the
policy-compliant
+ default) or branch-M.N only (treated as a maintenance-only bugfix).
Returns the list of
+ refs actually picked into, so the main loop can advance its
remaining-branches list
+ correctly.
+ """
+ pick_ref = bold_input("Enter a branch name [%s]: " % default_branch)
+ if pick_ref == "":
+ pick_ref = default_branch
+
+ # Upstream-First check: if the committer typed branch-M.N and branch-M.x
exists as a
+ # sibling release branch that has not yet received this commit, surface
the policy and
+ # offer to pick both in one step. Skipping when sibling_x is the merge
target or already
+ # in already_picked avoids a redundant prompt / failing re-cherry-pick.
+ sibling_x = None
+ m = re.match(r"^branch-(\d+)\.(\d+)$", pick_ref)
+ if m:
+ candidate = "branch-%s.x" % m.group(1)
+ if (
+ candidate in branch_names
+ and candidate != pick_ref
+ and candidate not in already_picked
+ ):
Review Comment:
`candidate != pick_ref` is unreachable as false: the regex on line 525
constrains `pick_ref` to `branch-M.N` where N is a digit, while `candidate` is
`branch-M.x` (literal `x`). They can never be equal.
```suggestion
if (
candidate in branch_names
and candidate not in already_picked
):
```
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]