This is an automated email from the ASF dual-hosted git repository.
potiuk pushed a commit to branch main
in repository https://gitbox.apache.org/repos/asf/airflow-steward.git
The following commit(s) were added to refs/heads/main by this push:
new 1d921a0 feat(security-issue-sync): five pre-push hygiene gates for
CVE-record quality (#372)
1d921a0 is described below
commit 1d921a071f72ddba0e439ba1afd74e74539d3fdf
Author: Jarek Potiuk <[email protected]>
AuthorDate: Fri May 29 16:37:04 2026 +0200
feat(security-issue-sync): five pre-push hygiene gates for CVE-record
quality (#372)
* feat(security-issue-sync): require named upgrade-target version in Short
public summary on every push
The *"Short public summary for publish"* body field powers the
published CVE description that end users read in the advisory. A
summary that names the vulnerability accurately but lacks the
concrete upgrade-target version (e.g. *"upgrade to the Airflow
version that contains the fix"* without naming `3.3.0`) is a
defect — it forces the reader to open another tab to figure out
which version to pin, exactly the friction the advisory is
supposed to remove.
The skill already enforces this as a gate at the
`pr merged → fix released` transition. Widen the rule so it fires
on **every** sync pass that proposes a body-field update or a JSON
regen, not only at one boundary.
Two changes:
1. Step 2b paragraph: explicitly call out the "every sync pass"
trigger plus the per-scope target shapes (`apache-airflow X.Y.Z
or later`, `apache-airflow-providers-<name> X.Y.Z or later`,
`apache-airflow-helm-chart X.Y.Z or later`).
2. Step 1d signal-table row: detector for "summary populated but
does not name a concrete upgrade-target version". Surfaces the
rewrite proposal on every qualifying pass so the tightening
lands in lock-step with the next JSON regen.
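A minimal sketch of the detector idea above, assuming Python. The helper
name `names_upgrade_target` and the regex are hypothetical illustrations,
not the pattern the skill ships. It checks whether a summary contains a
`<package> <X.Y.Z>` pair for one of the three per-scope package shapes.

```python
import re

# Illustrative pattern (an assumption, not the skill's own regex): one of the
# three per-scope package names, optionally backtick-wrapped, followed by X.Y.Z.
UPGRADE_TARGET = re.compile(
    r"(?<![\w-])`?apache-airflow(?:-providers-[a-z0-9-]+|-helm-chart)?`?"
    r"\s+`?\d+\.\d+\.\d+`?"
)


def names_upgrade_target(summary: str) -> bool:
    """Return True when the summary names a concrete `<package> <X.Y.Z>` target."""
    return UPGRADE_TARGET.search(summary) is not None
```

A summary ending in "upgrade to `apache-airflow` 3.2.2 or later" passes this
check, while "upgrade to the Airflow version that contains the fix" does not.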
Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>
* feat(security-issue-sync): add title hygiene +
incomplete-fix-to-another-CVE summary rules
Two complementary additions to security-issue-sync, both validated
on every sync pass (not gated to one transition):
1. Issue-title hygiene. The GitHub issue title is read verbatim
by generate-cve-json into containers.cna.title, so it ships to
cve.org and the published advisory. Re-use the same title-strip
cascade security-cve-allocate applies at allocation time —
project-name tokens, internal split-markers, report-form
classifiers, external-tracker IDs, version-noise suffixes.
Titles drift between allocation and the final regen, so the
cascade has to re-run on every sync. Preserve stripped context
as audit trail in body / rollup; never silently drop it.
2. Incomplete-fix-to-another-CVE summary expansion. When a
tracker is a follow-up to a previously-PUBLISHED CVE — typically
because the original root cause spanned multiple products
(provider + core, chart + core) and the team split per
product/package — the Short public summary for publish must
do more than describe this tracker's vulnerability in isolation.
It must (1) name the prior CVE explicitly, (2) state the prior
fix did not cover the current product, (3) tell users who
already applied the prior fix to also apply this one. Without
this cross-CVE + cross-product framing, downstream readers see
two CVEs about the same root-cause class and assume one fix
covered both — missing that two upgrades are needed.
Both rules are detector + propose-rewrite. The user confirms
each summary/title edit before regen + push to Vulnogram, same
as the existing summary-quality rule.
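The title rule can be sketched roughly as follows, assuming Python. The
function `strip_title`, the `PROJECT` constant, and every regex here are
hypothetical; the real cascade is the one security-cve-allocate defines.
The sketch returns the stripped tokens alongside the cleaned title so they
can be kept as audit trail rather than dropped.

```python
import re

PROJECT = "Apache Airflow"  # assumption: the project-name token to strip

# Illustrative patterns only; the skill's own cascade may differ.
_PATTERNS = [
    re.compile(r"\[\s*Security (?:Report|Issue)\s*\]", re.I),   # report-form classifiers
    re.compile(r"[\[(]\s*(?:GHSA|ZDRES|HUNTR|GHSL)-[\w-]+\s*[\])]", re.I),  # tracker IDs
    re.compile(r"\(split (?:for scope clarity )?from #\d+\)", re.I),        # split markers
    re.compile(rf"\({re.escape(PROJECT)}(?: [\w.]+)?\)", re.I),  # "(Apache Airflow X.Y)"
    re.compile(r"\(v?\d+(?:\.(?:\d+|x))*\)\s*$", re.I),          # trailing version noise
    re.compile(rf"^\s*{re.escape(PROJECT)}\s*:\s*", re.I),       # leading "Project:"
    re.compile(rf"\s+in {re.escape(PROJECT)}\s*$", re.I),        # trailing "in Project"
]


def strip_title(title: str) -> tuple[str, list[str]]:
    """Return (cleaned_title, stripped_tokens); tokens are kept for the audit trail."""
    cleaned, stripped = title, []
    changed = True
    while changed:  # re-run until stable: removing one token can expose another
        changed = False
        for pattern in _PATTERNS:
            match = pattern.search(cleaned)
            if match:
                stripped.append(match.group(0).strip())
                cleaned = cleaned[: match.start()] + " " + cleaned[match.end():]
                cleaned = " ".join(cleaned.split())  # normalise whitespace
                changed = True
    return cleaned, stripped
```

The second return value is what would be moved into the issue body or the
rollup before the title is edited.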
Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>
* feat(security-issue-sync): require trigger conditions (who/when/action)
in Short public summary
Extend the Step 2b summary-quality rule and the matching Step 1d
detector with a fourth gate: the summary must state the triggering
conditions, not only the bug mechanism.
The reader scans the published advisory asking "does this affect
us?" — and the answer comes from trigger context, not from a
mechanical description of the bug. Concretely the summary must
make these three things unambiguous in one sentence each (in any
order):
WHO — attacker role / required capability (an authenticated
UI user with Op permissions, a Dag author, a partner
with write access to the bucket, a worker holding a
valid Execution-API JWT, etc.).
WHEN — deployment shape / config / feature that has to be
active (when [opensearch] host embeds credentials,
when the Kubernetes executor is configured, affects
deployments where the webserver is reachable by
untrusted users, etc.).
ACTION — the step the attacker takes against which surface
(follows a crafted next= redirect URL, uploads an
object containing .. path segments, PATCHes the
deferred-state endpoint with crafted next_kwargs,
etc.).
A summary without all three forces the reader to open the issue,
the PR, or the patch to figure out the trigger, which is exactly the
work the advisory should remove.
Step 2b paragraph: enumerate the three required elements with
worked example wordings (positive + negative).
Step 1d signal-table row: detector heuristic looks for phrases
that mark each of the three elements (the regex set is in the
row); fires when fewer than two of three are unambiguously
present.
Apply this gate in parallel with the upgrade-target rule (same
sync pass) so a rewrite covers both gates at once.
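A rough sketch of the who / when / action heuristic, assuming Python. The
WHO phrases follow the ones quoted in the Step 1d row; the WHEN and ACTION
phrase sets are assumptions made up for illustration, since the full regex
set lives in the row itself. `missing_elements` and `gate_fires` are
hypothetical names.

```python
import re

# WHO phrases mirror the Step 1d row; WHEN and ACTION sets are assumptions.
ELEMENT_PATTERNS = {
    "who": [
        r"\ban authenticated [\w ]+ user\b",
        r"\ba dag author\b",
        r"\ban attacker with\b",
        r"\ba user able to\b",
    ],
    "when": [r"\bwhen\b", r"\bdeployments where\b"],
    "action": [r"\b(?:follows?|uploads?|crafts?|reads?|patches)\b"],
}


def missing_elements(summary: str) -> list[str]:
    """List which of who / when / action have no marker phrase in the summary."""
    return [
        name
        for name, patterns in ELEMENT_PATTERNS.items()
        if not any(re.search(p, summary, re.IGNORECASE) for p in patterns)
    ]


def gate_fires(summary: str) -> bool:
    """Fire when fewer than two of the three elements are present."""
    return len(missing_elements(summary)) >= 2
```

Returning the missing element names, rather than only a boolean, lets the
rewrite proposal say which part of the trigger context to add.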
Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>
* feat(security-issue-sync): add CWE long-form rule + Step 5b pre-push
hygiene-gate scan
Two more additions on top of the four existing pre-push hygiene gates
(upgrade-target version, trigger conditions, title cleanup,
incomplete-fix cross-CVE clause): a fifth gate for the CWE field and
a scan that re-checks all five before a push:
1. CWE field must carry the long-form `CWE-NNN: <Canonical
Title>` shape, not just a bare `CWE-NNN` token. Examples:
`CWE-22: Improper Limitation of a Pathname to a Restricted
Directory ('Path Traversal')`, `CWE-502: Deserialization of
Untrusted Data`, `CWE-601: URL Redirection to Untrusted Site
('Open Redirect')`. The value lands verbatim in the CVE
record's `problemTypes[].descriptions[].description`; readers
don't memorise MITRE numbering. Prefer a CWE from the
project's *advised CWEs* list when one is declared.
2. Step 5b pre-push hygiene-gate scan. Before any
`vulnogram-api-record-update` push, re-scan the JSON about to
be pushed for all five pre-push gates: title strip cascade,
upgrade-target version, trigger conditions, incomplete-fix
cross-CVE clause, CWE long form. When any gate fails the
regenerated JSON, the recovery is to fix the underlying body
field (or title), re-regen, then re-scan — not to push a
degraded record. The gates protect against the case where
body fields drift between the Step 2b proposal cycle and the
actual push (e.g. a user applied only a subset of the proposed
updates).
The pre-push scan is the safety net that catches the "body
proposal landed but the push went out on a stale or partial
edit" failure mode; the individual Step 1d detector rows surface
the rewrite proposals on every sync pass.
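A minimal sketch of the pre-push scan, assuming Python and a hypothetical
`failed_gates` helper. It covers only the three mechanically checkable
gates (title noise, upgrade-target version, CWE long form); the
trigger-condition and cross-CVE gates are left out for brevity. The field
paths are assumed to sit under `containers.cna`, and all regexes are
illustrative, not the skill's own.

```python
import re

# All three patterns are illustrative assumptions.
TITLE_NOISE = re.compile(
    r"\[\s*Security (?:Report|Issue)\s*\]"
    r"|[\[(]\s*(?:GHSA|ZDRES|HUNTR|GHSL)-"
    r"|\(split (?:for scope clarity )?from #\d+\)",
    re.I,
)
UPGRADE_TARGET = re.compile(
    r"`?apache-airflow(?:-providers-[a-z0-9-]+|-helm-chart)?`?\s+`?\d+\.\d+\.\d+`?"
)
CWE_LONG_FORM = re.compile(r"CWE-\d+: \S.*")


def failed_gates(record: dict) -> list[str]:
    """Return the names of the gates the CVE JSON fails; empty means safe to push."""
    cna = record["containers"]["cna"]  # assumed location of the fields below
    title = cna.get("title", "")
    summary = cna["descriptions"][0]["value"]
    cwe = cna["problemTypes"][0]["descriptions"][0]["description"]

    failures = []
    if TITLE_NOISE.search(title):
        failures.append("title")
    if not UPGRADE_TARGET.search(summary):
        failures.append("upgrade-target")
    if not CWE_LONG_FORM.fullmatch(cwe):
        failures.append("cwe-long-form")
    return failures
```

A non-empty result means the push is skipped: the body field or title is
fixed, the JSON is regenerated, and the scan runs again.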
Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>
---------
Co-authored-by: Claude Opus 4.7 (1M context) <[email protected]>
---
.claude/skills/security-issue-sync/SKILL.md | 194 +++++++++++++++++++++++++++-
1 file changed, 191 insertions(+), 3 deletions(-)
diff --git a/.claude/skills/security-issue-sync/SKILL.md b/.claude/skills/security-issue-sync/SKILL.md
index 5691083..3083508 100644
--- a/.claude/skills/security-issue-sync/SKILL.md
+++ b/.claude/skills/security-issue-sync/SKILL.md
@@ -761,6 +761,11 @@ update, label change, or next-step recommendation in Step 2:
| The *"Affected versions"* body field is missing, holds a pre-convention shape, or carries the project's pre-release sentinel, and the tracker is **not** at `fix released` yet | Propose populating / refining *"Affected versions"* per the project's convention. The per-scope shape, the pre-release sentinel (if any), and the lifecycle live in [`<project-config>/scope-labels.md` — *Affected versions convention by scope*](../../../<project-config>/scope-labels.md#affected-versions-convention [...]
| The *"Affected versions"* body field has a value but it is **not backtick-wrapped** (the raw value, as returned by `gh issue view --json body`, starts with a `>` character or contains a bare `>=` / `<=` / `<` / `>` token outside a `` ` `` … `` ` `` span) | Propose wrapping the value in backticks (e.g. `` `>= 3.0.0, < 3.2.2` ``, `` `< 3.2.2` ``, `` `<= 3.2.1` ``). **Why:** the leading `>` is the markdown blockquote marker — without backticks, GitHub renders the rendered field as a quote [...]
| A tracker is transitioning to `fix released` (per the row below) and *"Affected versions"* still carries the project's pre-release sentinel | Propose replacing the sentinel with the concrete released version per the project's convention; see [`<project-config>/scope-labels.md` — *Affected versions convention by scope*](../../../<project-config>/scope-labels.md#affected-versions-convention-by-scope) for the recipe. After the body update, regenerate the CVE JSON attachment so `versions[] [...]
+| The *"Short public summary for publish"* body field is populated but does **not** name a concrete upgrade-target version — the rendered text mentions *"upgrade"* / *"upgrading"* but no `<package> <X.Y.Z>` pattern, or ends with a generic phrase like *"the version that contains the fix"* / *"a later version"* / *"the next release"* | Propose tightening the summary to name the upgrade-target version verbatim. Resolve the version from the fix PR's milestone (the canonical signal — set at m [...]
+| The *"Short public summary for publish"* body field is populated but does **not** state the triggering conditions — the rendered text describes the bug mechanism without identifying (a) the attacker role / capability, (b) the deployment configuration that has to be active, OR (c) the action the attacker takes against which surface. Detector heuristic: scan the summary for any of these phrases — *"an authenticated [\\w ]+ user"*, *"a Dag author"*, *"an attacker with"*, *"a user able to" [...]
+| The tracker is an **incomplete-fix follow-up to another CVE** — detected by any of: the rollup or body mentions *"incomplete fix for `CVE-YYYY-NNNNN`"* / *"follow-up to `CVE-YYYY-NNNNN`"* / *"sibling tracker"*; the title contains a *"(incomplete fix for `CVE-YYYY-NNNNN`)"* parenthetical; the `affected[]` array names a different `packageName` than the referenced prior CVE; OR the tracker was opened as a split from a closed-`announced` tracker whose CVE is already PUBLISHED — **AND** the [...]
+| The *"CWE"* body field is populated with a bare `CWE-NNN` token (no description text) — e.g. `CWE-22` or `CWE-502` alone, without the canonical short description that follows in the format `CWE-NNN: <Title>` | Propose expanding the field to `CWE-NNN: <Canonical Title>` per the MITRE CWE catalog (e.g. `CWE-22: Improper Limitation of a Pathname to a Restricted Directory ('Path Traversal')`, `CWE-502: Deserialization of Untrusted Data`, `CWE-601: URL Redirection to Untrusted Site ('Open R [...]
+| The **issue title** contains adopter-specific or internal noise that would otherwise ship to the public CVE record — leading or trailing project-name tokens (e.g. ``Apache Airflow:`` / ``in Apache Airflow`` / ``(Apache Airflow X.Y)``), internal split markers (``(split from #NNN)`` / ``(split for scope clarity from #NNN)``), report-form classifiers (``[ Security Report ]`` / ``[Security Issue]``), external-tracker IDs in parentheses or brackets (``[GHSA-xxxx-xxxx-xxxx]``, ``(ZDRES-NNNNN [...]
| A release carrying the fix has shipped. Detection is **scope-dependent** — different scope labels on a project can ride different release trains, each with its own *"is it released?"* signal (which artifact registry to consult, what to query, how to map a tracker's milestone to that registry, partial-release edge cases). The per-scope detection recipe lives in [`<project-config>/scope-labels.md` — *Detecting that a fix release has shipped*](../../../<project-config>/scope-labels.md#det [...]
| GHSA state transition (opened, accepted, published, rejected) in a GHSA-forwarded email | If the GHSA is closed as "not accepted" but the security team accepted the report on `security@`, flag the divergence in the status comment so it is not lost. |
| Team member saying *"let's also backport to v3-2-test"* / *"please mark X for backport"* | Note the requested backport label on the public PR as an item for Step 9 of the `security-issue-fix` workflow. |
@@ -1377,6 +1382,34 @@ will change and *why*. Group them by category:
(`announced - emails sent`, `announced`,
`vendor-advisory`) keep the release manager because the advisory
lifecycle is theirs. Do **not** shuffle assignees back and forth.
+- **Issue title hygiene** — the GitHub issue title ships verbatim into
+ the CVE record's `containers.cna.title` field (read by the
+ `generate-cve-json` script on every regen) and from there into the
+ published advisory and `cve.org`. **On every sync pass**, run the
+ same title-strip cascade the
+  [`security-cve-allocate` skill applies at allocation time](../security-cve-allocate/SKILL.md#step-2--compute-the-cve-ready-title) —
+ strip leading/trailing project-name tokens (e.g. ``<project>:``,
+ ``in <project>``, ``(<project> X.Y)``), internal
+ split-markers (``(split from #NNN)``), report-form classifiers
+ (``[Security Report]``, ``[Security Issue]``), external-tracker IDs
+ (``[GHSA-...]``, ``(ZDRES-...)``, ``(HUNTR-...)``, ``(GHSL-...)``)
+ and version-noise suffixes (``(v3.2.1)``, ``(3.x)``). When the
+ cascade would change the title, propose the diff as a numbered
+ Step 2b item; on confirmation, ``gh issue edit <N> --title
+ "<cleaned>"`` and then regen + push CVE JSON so the record's
+ `title` field picks up the cleaned value. **Preserve stripped
+ context as audit trail** — split-from references, GHSA IDs,
+ internal report-form classifiers all carry information the
+ security team uses to navigate sibling reports and reviewer
+ threads. Move them into the issue body (a `### Related references`
+ section) or into the rollup as an audit entry; never silently
+ drop them. Titles drift between allocation and the final regen
+ (manual edits, sibling-tracker splits, GHSA-relay imports that
+ append the GHSA ID), so the cascade has to re-run on every sync
+ even when no other body update is being proposed. The Step 1d
+ signal-table row *"The issue title contains adopter-specific or
+ internal noise"* is the detector that surfaces the cleanup
+ proposal on every qualifying pass.
- **Description fields** — if the issue body is missing any of the fields the
  release manager will eventually need (CWE, product, affected versions, severity,
CVE ID, credits, links to PRs, short public summary for publish), propose a
@@ -1432,9 +1465,125 @@ will change and *why*. Group them by category:
the fixed version to upgrade to, the mitigations available for users
who cannot upgrade immediately, and the CWE class (allowed and
useful — CWE is not embargoed information once the advisory ships).
- When the field is technically accurate but missing the action a user
- should take, propose a rewrite — even when the rest of the gate at
- the `pr merged → fix released` transition is otherwise clear.
+
+ **Validate this on every sync pass that proposes a body-field update
+ or a JSON regen**, not only at the `pr merged → fix released`
+ boundary. A summary that names the vulnerability accurately but
+ lacks the upgrade-target version (e.g. *"upgrade to the Airflow
+ version that contains the fix"* without naming `3.3.0`) is a
+ defect; propose tightening it before regen lands in the embedded
+ JSON + the next push to the CVE record.
+
+ **The summary must also state the triggering conditions** — the
+ reader scans the published advisory asking *"does this affect us?"*,
+ and the answer comes from the trigger context, not the bug
+ mechanism. Concretely the summary should make these three things
+ unambiguous in one sentence each (in any order):
+
+ 1. **Who** — the attacker role / capability required (e.g. *"an
+ authenticated UI user with `Op` permissions"*, *"a Dag author"*,
+ *"a partner with write access to the source bucket"*, *"a worker
+ holding a valid Execution-API JWT"*, *"a user able to reach the
+ login endpoint"*).
+ 2. **When / configuration** — the deployment shape / config /
+ feature that has to be active for the issue to apply (e.g.
+ *"when `[opensearch] host` embeds credentials"*, *"when the
+ Kubernetes executor is configured"*, *"when the
+ `apache-airflow-providers-keycloak` auth manager is enabled"*,
+ *"when DAGs with assets are configured to materialise via the
+ REST API"*).
+ 3. **Action / surface** — the step the attacker takes against
+ which surface (e.g. *"follows a crafted `next=` redirect URL"*,
+ *"uploads an object containing `..` path segments"*, *"reads
+ task logs in the UI"*, *"PATCHes the deferred-state endpoint
+ with crafted `next_kwargs`"*).
+
+ The condition tuple lets a reader who is *not* familiar with the
+ internal code paths decide whether their deployment is exposed
+ without opening the source or the original report. A summary that
+ omits any of the three forces them to read the issue PR / patch
+ to figure out the trigger — exactly the work the advisory is
+ meant to remove. When the field is technically accurate but
+ missing one of (who / when / action), propose adding it on the
+ same sync pass as the upgrade-target tightening.
+
+ Worked example shape (a single ASF Airflow CVE):
+
+ > *"An authenticated UI user with permission to read DAGs could
+ > craft a `next=` parameter on the login route that bypassed
+ > `is_safe_url`, redirecting other users to an attacker-controlled
+ > origin after authentication. Affects deployments where the
+ > webserver is reachable by untrusted users. Users are advised to
+ > upgrade to `apache-airflow` 3.2.2 or later."*
+
+ The first sentence names the attacker (*authenticated UI user*),
+ the action (*crafts `next=`*), and the surface (*login route*); the
+ second sentence names the configuration (*webserver reachable by
+ untrusted users*); the third is the upgrade ask. When the carrier release
+ is known (the fix PR's milestone is set), name it verbatim —
+ ``apache-airflow 3.3.0 or later``,
+ ``apache-airflow-providers-google 11.2.0 or later``,
+ ``apache-airflow-helm-chart 1.18.0 or later``, etc. When the
+ carrier release is not yet known (early `pr created` state where
+ the PR has no milestone), keep the placeholder but flag the gap
+ in Step 2c so the next sync after milestone-set catches it. The
+ Step 1d signal-table row *"`Short public summary for publish` is
+ populated but does not name a concrete upgrade-target version"*
+ is the detector that surfaces the rewrite proposal on every
+ qualifying pass.
+
+ **Incomplete-fix-to-another-CVE: the summary must name the prior
+ CVE *and* tell users who already applied that fix to apply this
+ one too.** When the tracker is an *incomplete-fix follow-up* to a
+ previously-published CVE — detected by the rollup, the body, or
+ the issue title mentioning *"incomplete fix for `<CVE-ID>`"*,
+ *"follow-up to `<CVE-ID>`"*, *"split from"* a sibling tracker
+ whose CVE is already PUBLIC, or by the title's prior-CVE token —
+ the summary must additionally:
+
+ 1. Name the prior CVE explicitly (``<project> previously
+ released a fix for `<PRIOR-CVE-ID>` that addressed the
+ `<other-package>` side of the same vulnerability class``).
+ 2. State that the **previous fix did not cover the current
+ product / surface** (e.g. ``The previous fix covered the
+ `<sibling-package>` package; the `<current-package>` package
+ was not patched at the time``).
+ 3. Tell users who already applied the prior CVE's fix to **also
+ apply this one** (``Users who already upgraded
+ `<sibling-package>` per the `<PRIOR-CVE-ID>` advisory should
+ additionally upgrade `<current-package>` to <X.Y.Z> or later
+ — the two fixes are complementary, not duplicates``).
+
+ **Why this matters.** When a CVE is published as a "follow-up"
+ to an earlier CVE, the reader's natural reading is *"I already
+ applied the earlier fix; this one is a duplicate"*. Without
+ explicit cross-CVE + cross-product framing, downstream consumers
+ miss that two upgrades are needed (one per product / package) to
+ close the original vulnerability fully. The advisory has to do
+ the work of explaining the split — the CVE ID alone is not
+ enough signal.
+
+ **Detection signals** (any one triggers the cross-CVE summary
+ shape):
+
+ - The tracker's `Short public summary` already mentions the
+ prior `CVE-YYYY-NNNNN` token but lacks the *"users who
+ applied the prior fix should also..."* clause.
+ - The rollup carries a *"sibling tracker"* / *"split for scope
+ clarity"* / *"follow-up to `<PRIOR-CVE-ID>`"* entry.
+ - The issue title contains an explicit *"incomplete fix for
+ `<PRIOR-CVE-ID>`"* parenthetical (per the title-strip
+ cascade — that token is stripped from the title but the
+ cross-CVE relationship is preserved in body / rollup).
+ - The CVE record's `affected[]` array names a different
+ `packageName` than the prior CVE's record, AND the prior CVE
+ is on the same root-cause class.
+
+ When any signal fires, propose the cross-CVE summary expansion
+ as part of the same Step 2b body-field update set. Do **not**
+ silently emit a summary that omits the cross-CVE / cross-product
+ upgrade ask — that creates the "I already applied the fix"
+ blind spot the rule exists to prevent.
**Special case for the "Security mailing list thread" field — leave
it alone.** This field holds the internal navigation reference to
@@ -2749,6 +2898,45 @@ Step 6 below describes how to verify the state advance landed
(no CVE allocated; tracker closed as invalid / duplicate / not
CVE worthy). There is no record to push to.
+1b. **Pre-push hygiene-gate scan.** Before any push call, re-scan
+ the JSON about to be pushed for the five pre-push gates that
+ make the published CVE record user-facing:
+
+ - **Title strip cascade** — `containers.cna.title` must have
+      gone through the [`security-cve-allocate` Step 2 cascade](../security-cve-allocate/SKILL.md#step-2--compute-the-cve-ready-title)
+ and contain no project-name prefix/suffix, no `[GHSA-...]` /
+ `(ZDRES-...)` / `(HUNTR-...)` / `(GHSL-...)` external tracker
+ IDs, no `(split from #NNN)` markers, no `[Security Report]`
+ classifier, no version-noise suffix. The cascade is the same
+ one the issue-title hygiene Step 1d row enforces; this gate
+ re-runs it on the JSON's `title` field directly because the
+ generator reads the GitHub issue title verbatim and the JSON
+ value is what actually ships.
+ - **Short public summary names an upgrade-target version** —
+ `descriptions[0].value` must contain a `<package> <X.Y.Z>`
+ pattern; bare *"upgrade to the version that contains the
+ fix"* fails.
+ - **Short public summary states trigger conditions** — the
+ who / when / action triplet from the Step 2b paragraph
+ above; at least two of three must be unambiguously present.
+ - **Incomplete-fix cross-CVE clause** — when the tracker is
+ a follow-up to a prior PUBLISHED CVE (the rollup or body
+ declares the relationship), the summary must name the prior
+ CVE AND tell users who applied the prior fix to also apply
+ this one.
+ - **CWE field has the long-form description** —
+ `problemTypes[0].descriptions[0].description` must be in
+ the `CWE-NNN: <Title>` shape, not a bare `CWE-NNN` token.
+
+ When any gate fails the JSON the regen just produced, the
+ right recovery is **not** to push — fix the underlying body
+ field (or title, for the title gate), re-regen, then re-scan.
+ The gates exist to catch the cases where the body fields drift
+ between the Step 2b proposal cycle and the actual push (e.g.
+ a Step 2b proposal landed but the user edited only a subset
+ of the proposed updates). Skipping the push on a gate failure
+ forces the next sync iteration to surface the remaining edits.
+
2. **Probe the session** — `vulnogram-api-check`:
```bash