This is an automated email from the ASF dual-hosted git repository.
potiuk pushed a commit to branch main
in repository https://gitbox.apache.org/repos/asf/airflow-steward.git
The following commit(s) were added to refs/heads/main by this push:
new e2b58f1 feat(security-issue-sync): tune pre-flight classifier —
skill-marker detection + relaxed rules (#416)
e2b58f1 is described below
commit e2b58f192b5277580a270b8a48e9e6d437921b03
Author: Jarek Potiuk <[email protected]>
AuthorDate: Sun May 31 14:31:28 2026 +0200
feat(security-issue-sync): tune pre-flight classifier — skill-marker
detection + relaxed rules (#416)
A dry-run of #414's pre-flight against a real adopter tracker
revealed the original rules misfired in two ways:
- The "last comment author is a bot" check was structurally
unreachable on single-operator private trackers where the sync
skill writes rollup updates as the operator's personal GitHub
user, not as a *[bot] account.
- The 7-day updatedAt safety override caught most trackers
because every tracker had been touched by the recent sync
itself (rollup-comment writes, label flips) — conflating
skill activity with substantive activity. Skip rate measured
~5% in this setup vs the predicted 30-50%.
This tunes the classifier with two changes:
1. Skill-or-bot detection. Treat a comment as bot-equivalent
when its body starts with the skill marker
`<!-- apache-steward: ` (matches every status-rollup,
release-manager hand-off, and wrap-up comment the framework
writes). Falls back to the original `*[bot]` login check, plus
an override-file hook for adopters with personal-account bots.
Requires fetching body on the last comment — bumps query
response size moderately (still cheaper than one subagent
transcript), and the body field is what enables the
skill-marker detection that drives most of the real-world
skip rate.
2. Relaxed lifecycle skip rules. The original "idle > 14d"
gates were a safety net for the broken bot-detection. With
skill-or-bot detection working, the "all phases done; awaiting
release" / "fix released; awaiting advisory" patterns are
skip-eligible regardless of comment age — the skill marker
itself is the "nothing new since last sync" signal.
Re-running the dry-run on the same setup: skip rate ~5% → ~30%,
and the skipped trackers were all correctly steady-state ones.
Adds a new "fix released; awaiting advisory propagation" skip
rule for the `cve allocated + fix released` label set — the
single largest contributor to the new skip count.
---
.claude/skills/security-issue-sync/bulk-mode.md | 65 +++++++++++++++++++------
1 file changed, 50 insertions(+), 15 deletions(-)
diff --git a/.claude/skills/security-issue-sync/bulk-mode.md
b/.claude/skills/security-issue-sync/bulk-mode.md
index d8992a4..e3104a4 100644
--- a/.claude/skills/security-issue-sync/bulk-mode.md
+++ b/.claude/skills/security-issue-sync/bulk-mode.md
@@ -71,7 +71,9 @@ concurrently, which is exactly what the sync needs.
**One query, one round-trip.** Build an aliased multi-field
GraphQL query that fetches state for every resolved issue at
- once:
+ once. The `body` field on `comments(last: 1)` lets the classifier
+ distinguish skill-authored writes from human activity (see
+ *Skill-or-bot detection* below):
```bash
gh api graphql --raw-field query="$(cat <<'GQL'
@@ -80,7 +82,9 @@ concurrently, which is exactly what the sync needs.
i<N1>: issue(number: <N1>) {
number state closedAt updatedAt
labels(first: 30) { nodes { name } }
- comments(last: 1) { nodes { author { login } createdAt } }
+ comments(last: 1) {
+ nodes { author { login } createdAt body }
+ }
}
i<N2>: issue(number: <N2>) { ... }
# repeat one aliased block per resolved issue
@@ -92,8 +96,40 @@ concurrently, which is exactly what the sync needs.
The aliased-field form (`i<N>: issue(number: <N>) { ... }`)
works for any number of issues in a single query. For a 30-issue
- bulk sweep the request is ~3 KB and the response is ~6 KB —
- cheaper than a single subagent transcript.
+ bulk sweep the request is ~3 KB and the response is ~50-130 KB
+ depending on how long the latest comments are — still cheaper
+ than even one subagent transcript, and the body field is what
+ enables the skill-marker detection that drives ~30% of the
+ real-world skip rate.
+
+ **Skill-or-bot detection — required for the rules below.** On a
+ private single-operator tracker, the sync skill itself writes
+ rollup updates and RM hand-off comments as the operator's
+ GitHub user — *not* as a `*[bot]` account. A naive
+ *"last comment author is a bot"* check is structurally
+ unreachable on those trackers and the classifier degenerates
+ to ~5% skip rate. The fix: recognise skill-authored comments
+ by their **marker comment**, which every status-rollup /
+ hand-off / wrap-up comment begins with:
+
+ ```text
+ <!-- apache-steward: <comment-kind> v<N> -->
+ ```
+
+ Concretely, treat the last-comment author as *bot-equivalent* if
+ **any** of these is true:
+
+ - `login in {github-actions[bot], dependabot[bot]}` or
+ `login` ends with `[bot]` (real GitHub App accounts).
+ - The body starts with `<!-- apache-steward: ` (skill-authored
+ comment — see
[`tools/github/status-rollup.md`](../../../tools/github/status-rollup.md)
+ for the marker spec).
+ - `login` matches a `bot_logins:` entry in the override file at
+
[`.apache-steward-overrides/security-issue-sync.md`](../../../docs/setup/agentic-overrides.md)
+ (for adopters with personal-account bots).
+
+ The remaining rules use *"skill-or-bot last commenter"* as a
+ shorthand for this composite check.
**Classification rule table.** Apply the rules **in order**;
the first match wins. Conservative by design — `skip-noop`
@@ -101,21 +137,20 @@ concurrently, which is exactly what the sync needs.
| Signals | Decision | Reason recorded in recap |
|---|---|---|
- | `updatedAt` within the last **7 days** | `dispatch` | recent activity
safety override — never skip |
- | Last comment author is **not** a bot AND `createdAt` within last **24h**
| `dispatch-urgent` | reporter just replied |
+ | `updatedAt` within last **7 days** AND last comment is **NOT**
skill-or-bot | `dispatch` | recent human activity — safety override |
+ | Last comment author is **not** skill-or-bot AND `createdAt` within last
**24h** | `dispatch-urgent` | reporter just replied |
| Closed > **30 days** ago AND has `announced` label | `skip-noop` |
`post-announce; CVE published` |
| Closed > **90 days** ago AND no `announced` label | `skip-noop` | `stale
closed (invalid/duplicate/abandoned)` |
- | Open AND has `cve allocated` + `pr merged` + `announced` AND last
comment > 14d ago AND last comment author is a bot | `skip-noop` | `all phases
done; awaiting closure heuristic` |
- | Open AND has `cve allocated` + `pr merged` AND last comment > 14d ago
AND last comment author is a bot | `skip-noop` | `awaiting release` |
+ | Open AND has `cve allocated` + `pr merged` + `announced` AND last
comment is skill-or-bot | `skip-noop` | `all phases done; awaiting closure
heuristic` |
+ | Open AND has `cve allocated` + `pr merged` AND last comment is
skill-or-bot | `skip-noop` | `awaiting release` |
+ | Open AND has `cve allocated` + `fix released` AND last comment is
skill-or-bot | `skip-noop` | `fix released; awaiting advisory propagation` |
| Anything else | `dispatch` | — |
- **Bot detection.** The "author is a bot" test is *login matches
- one of*: `github-actions[bot]`, `dependabot[bot]`, the
- project's `<sync-bot>` handle if configured in
- [`<project-config>/project.md`](../../../<project-config>/project.md),
- or any GitHub user with the `[bot]` suffix. If the project has
- a personal-account bot, list it in the override file at
-
[`.apache-steward-overrides/security-issue-sync.md`](../../../docs/setup/agentic-overrides.md).
+ The relaxation vs the original rule design: skip-eligibility
+ rules no longer require *"idle > 14 days"*. Once the labels
+ show steady-state AND the last write was the skill itself,
+ a recently-bumped `updatedAt` is just the skill's own work —
+ not a reason to dispatch.
**Hard rules**: