jscheffl commented on code in PR #63672: URL: https://github.com/apache/airflow/pull/63672#discussion_r2937380313
########## contributing-docs/25_maintainer_pr_triage.rst: ########## @@ -0,0 +1,457 @@ + .. Licensed to the Apache Software Foundation (ASF) under one + or more contributor license agreements. See the NOTICE file + distributed with this work for additional information + regarding copyright ownership. The ASF licenses this file + to you under the Apache License, Version 2.0 (the + "License"); you may not use this file except in compliance + with the License. You may obtain a copy of the License at + + .. http://www.apache.org/licenses/LICENSE-2.0 + + .. Unless required by applicable law or agreed to in writing, + software distributed under the License is distributed on an + "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY + KIND, either express or implied. See the License for the + specific language governing permissions and limitations + under the License. + +Maintainer PR Triage and Review +=============================== + +This document describes the **auto-triage** workflow — a maintainer-driven tool that helps +Apache Airflow maintainers triage and review incoming Pull Requests faster and in a more +informed way. The tool is part of the Breeze development environment and is invoked via +``breeze pr auto-triage``. + +.. contents:: Table of Contents + :depth: 2 + :local: + +Overview +-------- + +Apache Airflow receives a high volume of Pull Requests from contributors around the world. +Maintainers need to assess each PR for basic quality criteria, CI status, merge conflicts, +and code correctness before it can be merged. The **auto-triage** tool streamlines this +process by combining deterministic checks with optional LLM-assisted analysis, while +keeping the maintainer firmly in control of every decision. + +**Key principles:** + +- **Maintainer-driven** — The tool presents information and suggests actions, but every + decision (drafting, closing, commenting, approving) is made by a human maintainer + through interactive prompts. 
The tool never takes autonomous action on PRs. + +- **Two-stage process** — The workflow is split into two distinct stages: a fast **triage** + pass that checks basic quality criteria, and a deeper **review** pass that performs + detailed code analysis with LLM assistance. + +- **Better-informed decisions** — By aggregating CI status, merge conflicts, unresolved + review threads, main-branch failure patterns, and LLM assessments into a single + interactive session, maintainers can make faster and more consistent decisions. + +Two-stage workflow +------------------ + +The auto-triage tool operates in two modes that correspond to the two stages of the PR +lifecycle: + +.. code-block:: text + + ┌─────────────────────────────────────────────────────────────────────────┐ + │ PR Lifecycle with Auto-Triage │ + │ │ + │ Contributor opens PR │ + │ │ │ + │ ▼ │ + │ ┌───────────┐ Maintainer runs ┌──────────────────────┐ │ + │ │ Open PR │───────────────────────▶ │ Stage 1: TRIAGE │ │ + │ │ (no label) │ breeze pr auto-triage │ Basic quality check │ │ + │ └───────────┘ └──────────┬───────────┘ │ + │ │ │ + │ ┌───────────────────────┼──────────┐ │ + │ │ │ │ │ + │ ▼ ▼ ▼ │ + │ Issues found Looks good Suspicious │ + │ │ │ │ │ + │ ▼ ▼ ▼ │ + │ Convert to Draft Add "ready for Close │ + │ with comment maintainer all PRs │ + │ │ review" label by │ + │ │ │ author │ + │ ▼ │ │ + │ Contributor fixes │ │ + │ and marks Ready │ │ + │ │ │ │ + │ └──────────┬────────────┘ │ + │ │ │ + │ ▼ │ + │ ┌──────────────────────┐ │ + │ Maintainer runs │ Stage 2: REVIEW │ │ + │ breeze pr auto-triage │ Detailed code │ │ + │ --mode review │ review with LLM │ │ + │ └──────────┬───────────┘ │ + │ │ │ + │ ┌──────────┼──────────┐ │ + │ │ │ │ │ + │ ▼ ▼ ▼ │ + │ Comments Approve Request │ + │ posted PR changes │ + │ │ │ + │ ▼ │ + │ Merge │ + └─────────────────────────────────────────────────────────────────────────┘ + + +Stage 1: Triage +--------------- + +The triage stage is the first pass over incoming PRs. 
It focuses on whether each PR meets +the project's basic `quality criteria <05_pull_requests.rst#pull-request-quality-criteria>`__ +and is ready for deeper review. It is invoked with: + +.. code-block:: bash + + breeze pr auto-triage + +This is the default mode (``--mode triage``). + +What the triage stage checks +............................. + +The triage stage performs a series of **deterministic checks** on each PR: + +1. **CI status** — Are the CI checks passing, failing, or still running? PRs with + in-progress workflows are skipped until the next triage run. A 4-hour grace period + prevents flagging very recent failures (the author may still be iterating). + +2. **Merge conflicts** — Does the PR have merge conflicts with the base branch? If so, + the author needs to rebase. + +3. **Unresolved review threads** — Are there open review conversations that the author + has not addressed? + +4. **Workflow approval** — For PRs from first-time contributors, CI workflows need + maintainer approval before they can run. The triage tool presents these PRs first + so maintainers can review the diff for security concerns before approving. + +After deterministic checks, PRs that pass are optionally sent to an **LLM for quality +assessment**. The LLM evaluates the PR title, description, and metadata against the +project's `Pull Request guidelines <05_pull_requests.rst#pull-request-guidelines>`__ +and flags potential violations such as: + +- Generic or unclear PR titles +- Missing or inadequate descriptions +- Missing Gen-AI disclosure (when AI-generated patterns are detected) +- Unrelated changes bundled together + +How triage processes PRs +........................ + +.. 
code-block:: text + + ┌──────────────────────────────────────────────────────────────────┐ + │ Triage Processing Pipeline │ + │ │ + │ Fetch PRs via GraphQL │ + │ │ │ + │ ▼ │ + │ Filter: exclude already-triaged, drafts with known issues │ + │ │ │ + │ ▼ │ + │ Enrich: fetch CI checks, merge status, review threads │ + │ │ │ + │ ├──────────────────────────┐ │ + │ │ │ │ + │ ▼ ▼ │ + │ Deterministic checks LLM assessment │ + │ (CI, conflicts, (title, description, │ + │ unresolved threads) quality criteria) │ + │ │ │ │ + │ ▼ ▼ │ + │ ┌─────────────────────────────────────────────┐ │ + │ │ Interactive maintainer session │ │ + │ │ │ │ + │ │ For each PR, display: │ │ + │ │ • PR title, author, age, labels │ │ + │ │ • CI status with failure details │ │ + │ │ • Merge conflict status │ │ + │ │ • Unresolved thread summary │ │ + │ │ • LLM assessment (if available) │ │ + │ │ • Main-branch failure patterns │ │ + │ │ (to distinguish systemic CI failures) │ │ + │ │ │ │ + │ │ Maintainer chooses action: │ │ + │ │ [d]raft [c]omment [x]close [r]erun CI │ │ + │ │ [b]rebase [m]ark ready [s]kip [q]uit │ │ + │ └─────────────────────────────────────────────┘ │ + └──────────────────────────────────────────────────────────────────┘ + +Available triage actions +........................ + +When the tool presents a PR, the maintainer can choose from these actions: + +.. list-table:: + :header-rows: 1 + :widths: 15 85 + + * - Action + - Description + * - **[d]raft** + - Convert the PR to draft status and post a comment listing the issues found. + The maintainer can select which violations to include. This signals to the + contributor that they should fix the listed issues and mark the PR as + "Ready for review" once done. + * - **[c]omment** + - Post a comment listing the issues without converting to draft. Useful when + the contributor is actively working on the PR. + * - **[x]close** + - Close the PR with a comment explaining the quality violations. 
Used when a + contributor has multiple PRs with repeated quality issues. + * - **[r]erun** + - Rerun failed CI checks. Useful when failures appear to be transient or + caused by infrastructure issues. + * - **[b]rebase** + - Suggest that the author rebase onto the latest base branch. + * - **[m]ark** + - Add the ``ready for maintainer review`` label, signaling that the PR has + passed basic quality checks and is ready for the deeper review stage. + * - **[s]kip** + - Skip the PR without taking any action. + * - **[o]pen** + - Open the PR in the browser for manual inspection. + * - **[w]show** + - Display the PR diff inline in the terminal. + * - **[q]uit** + - Exit the triage session. Progress on already-processed PRs is preserved. + +CI failure analysis +................... + +The triage tool provides context to help maintainers distinguish between failures caused +by the PR and systemic failures on the main branch: + +- **Main-branch failure patterns** — The tool fetches recent merged PRs and identifies + checks that are consistently failing across the repository. When a PR's failed check + matches a known main-branch failure, this is highlighted so the maintainer knows not + to penalize the contributor. + +- **Canary build status** — The status of scheduled canary builds on the main branch is + displayed, giving maintainers a quick view of overall CI health. + +- **Grace period** — Failures less than 4 hours old are treated as recent (the author + may still be iterating), and the PR is not flagged for those failures. + + +Stage 2: Review +--------------- + +The review stage is a deeper, LLM-assisted code review of PRs that have already passed +triage and carry the ``ready for maintainer review`` label. It is invoked with: + +.. code-block:: bash + + breeze pr auto-triage --mode review + +What the review stage does +.......................... + +.. 
code-block:: text + + ┌──────────────────────────────────────────────────────────────────┐ + │ Review Processing Pipeline │ + │ │ + │ Fetch PRs labeled "ready for maintainer review" │ + │ │ │ + │ ├──────────────────────────┐ │ + │ │ │ │ + │ ▼ ▼ │ + │ Deterministic checks LLM code review │ + │ (CI, conflicts, (fetches full diff, │ + │ unresolved threads) analyzes code changes) │ + │ │ │ │ + │ │ ┌─────────────────────┘ │ + │ │ │ (runs in parallel) │ + │ ▼ ▼ │ + │ ┌─────────────────────────────────────────────┐ │ + │ │ Interactive review session │ │ + │ │ │ │ + │ │ Phase 1: Deterministic failures │ │ + │ │ • Present PRs with CI/conflict issues │ │ + │ │ │ │ + │ │ Phase 2: LLM code review results │ │ + │ │ • Overall assessment (approve/comment/ │ │ + │ │ request changes) │ │ + │ │ • Line-level review comments │ │ + │ │ │ │ + │ │ For each comment, maintainer chooses: │ │ + │ │ [s]ubmit [e]dit [o]pen in browser │ │ + │ │ [k]skip [q]uit │ │ + │ └─────────────────────────────────────────────┘ │ + └──────────────────────────────────────────────────────────────────┘ Review Comment: ```suggestion ┌──────────────────────────────────────────────────────────────┐ │ Review Processing Pipeline │ │ │ │ Fetch PRs labeled "ready for maintainer review" │ │ │ │ │ ├──────────────────────────┐ │ │ │ │ │ │ ▼ ▼ │ │ Deterministic checks LLM code review │ │ (CI, conflicts, (fetches full diff, │ │ unresolved threads) analyzes code changes) │ │ │ │ │ │ │ ┌─────────────────────┘ │ │ │ │ (runs in parallel) │ │ ▼ ▼ │ │ ┌─────────────────────────────────────────────┐ │ │ │ Interactive review session │ │ │ │ │ │ │ │ Phase 1: Deterministic failures │ │ │ │ • Present PRs with CI/conflict issues │ │ │ │ │ │ │ │ Phase 2: LLM code review results │ │ │ │ • Overall assessment (approve/comment/ │ │ │ │ request changes) │ │ │ │ • Line-level review comments │ │ │ │ │ │ │ │ For each comment, maintainer chooses: │ │ │ │ [s]ubmit [e]dit [o]pen in browser │ │ │ │ [k]skip [q]uit │ │ │ 
└─────────────────────────────────────────────┘ │ └──────────────────────────────────────────────────────────────┘ ```
