jscheffl commented on code in PR #63672:
URL: https://github.com/apache/airflow/pull/63672#discussion_r2937376824
##########
contributing-docs/25_maintainer_pr_triage.rst:
##########
@@ -0,0 +1,457 @@
+ .. Licensed to the Apache Software Foundation (ASF) under one
+    or more contributor license agreements. See the NOTICE file
+    distributed with this work for additional information
+    regarding copyright ownership. The ASF licenses this file
+    to you under the Apache License, Version 2.0 (the
+    "License"); you may not use this file except in compliance
+    with the License. You may obtain a copy of the License at
+
+ .. http://www.apache.org/licenses/LICENSE-2.0
+
+ .. Unless required by applicable law or agreed to in writing,
+    software distributed under the License is distributed on an
+    "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+    KIND, either express or implied. See the License for the
+    specific language governing permissions and limitations
+    under the License.
+
+Maintainer PR Triage and Review
+===============================
+
+This document describes the **auto-triage** workflow — a maintainer-driven tool that helps
+Apache Airflow maintainers triage and review incoming Pull Requests faster and in a more
+informed way. The tool is part of the Breeze development environment and is invoked via
+``breeze pr auto-triage``.
+
+.. contents:: Table of Contents
+   :depth: 2
+   :local:
+
+Overview
+--------
+
+Apache Airflow receives a high volume of Pull Requests from contributors around the world.
+Maintainers need to assess each PR for basic quality criteria, CI status, merge conflicts,
+and code correctness before it can be merged. The **auto-triage** tool streamlines this
+process by combining deterministic checks with optional LLM-assisted analysis, while
+keeping the maintainer firmly in control of every decision.
+
+**Key principles:**
+
+- **Maintainer-driven** — The tool presents information and suggests actions, but every
+  decision (drafting, closing, commenting, approving) is made by a human maintainer
+  through interactive prompts. The tool never takes autonomous action on PRs.
+
+- **Two-stage process** — The workflow is split into two distinct stages: a fast **triage**
+  pass that checks basic quality criteria, and a deeper **review** pass that performs
+  detailed code analysis with LLM assistance.
+
+- **Better-informed decisions** — By aggregating CI status, merge conflicts, unresolved
+  review threads, main-branch failure patterns, and LLM assessments into a single
+  interactive session, maintainers can make faster and more consistent decisions.
+
+Two-stage workflow
+------------------
+
+The auto-triage tool operates in two modes that correspond to the two stages of the PR
+lifecycle:
+
+.. code-block:: text
+
+    ┌─────────────────────────────────────────────────────────────────────────┐
+    │                      PR Lifecycle with Auto-Triage                      │
+    │                                                                         │
+    │  Contributor opens PR                                                   │
+    │       │                                                                 │
+    │       ▼                                                                 │
+    │  ┌────────────┐    Maintainer runs      ┌──────────────────────┐        │
+    │  │  Open PR   │───────────────────────▶ │  Stage 1: TRIAGE     │        │
+    │  │ (no label) │ breeze pr auto-triage   │  Basic quality check │        │
+    │  └────────────┘                         └──────────┬───────────┘        │
+    │                                                    │                    │
+    │                            ┌───────────────────────┼──────────┐         │
+    │                            │                       │          │         │
+    │                            ▼                       ▼          ▼         │
+    │                      Issues found             Looks good Suspicious     │
+    │                            │                       │          │         │
+    │                            ▼                       ▼          ▼         │
+    │                    Convert to Draft         Add "ready for  Close       │
+    │                      with comment             maintainer   all PRs      │
+    │                            │                 review" label   by         │
+    │                            │                       │       author       │
+    │                            ▼                       │                    │
+    │                    Contributor fixes               │                    │
+    │                     and marks Ready                │                    │
+    │                            │                       │                    │
+    │                            └──────────┬────────────┘                    │
+    │                                       │                                 │
+    │                                       ▼                                 │
+    │                            ┌──────────────────────┐                     │
+    │    Maintainer runs         │  Stage 2: REVIEW     │                     │
+    │    breeze pr auto-triage   │  Detailed code       │                     │
+    │    --mode review           │  review with LLM     │                     │
+    │                            └──────────┬───────────┘                     │
+    │                                       │                                 │
+    │                            ┌──────────┼──────────┐                      │
+    │                            │          │          │                      │
+    │                            ▼          ▼          ▼                      │
+    │                        Comments    Approve    Request                   │
+    │                         posted       PR       changes                   │
+    │                                       │                                 │
+    │                                       ▼                                 │
+    │                                     Merge                               │
+    └─────────────────────────────────────────────────────────────────────────┘
+
+
+Stage 1: Triage
+---------------
+
+The triage stage is the first pass over incoming PRs. It focuses on whether each PR meets
+the project's basic `quality criteria <05_pull_requests.rst#pull-request-quality-criteria>`__
+and is ready for deeper review. It is invoked with:
+
+.. code-block:: bash
+
+    breeze pr auto-triage
+
+This is the default mode (``--mode triage``).
+
+What the triage stage checks
+.............................
+
+The triage stage performs a series of **deterministic checks** on each PR:
+
+1. **CI status** — Are the CI checks passing, failing, or still running? PRs with
+   in-progress workflows are skipped until the next triage run. A 4-hour grace period
+   prevents flagging very recent failures (the author may still be iterating).
+
+2. **Merge conflicts** — Does the PR have merge conflicts with the base branch? If so,
+   the author needs to rebase.
+
+3. **Unresolved review threads** — Are there open review conversations that the author
+   has not addressed?
+
+4. **Workflow approval** — For PRs from first-time contributors, CI workflows need
+   maintainer approval before they can run. The triage tool presents these PRs first
+   so maintainers can review the diff for security concerns before approving.
+
+After deterministic checks, PRs that pass are optionally sent to an **LLM for quality
+assessment**. The LLM evaluates the PR title, description, and metadata against the
+project's `Pull Request guidelines <05_pull_requests.rst#pull-request-guidelines>`__
+and flags potential violations such as:
+
+- Generic or unclear PR titles
+- Missing or inadequate descriptions
+- Missing Gen-AI disclosure (when AI-generated patterns are detected)
+- Unrelated changes bundled together
+
+How triage processes PRs
+........................
+
+.. code-block:: text
+
+    ┌──────────────────────────────────────────────────────────────────┐
+    │                    Triage Processing Pipeline                    │
+    │                                                                  │
+    │  Fetch PRs via GraphQL                                           │
+    │       │                                                          │
+    │       ▼                                                          │
+    │  Filter: exclude already-triaged, drafts with known issues       │
+    │       │                                                          │
+    │       ▼                                                          │
+    │  Enrich: fetch CI checks, merge status, review threads           │
+    │       │                                                          │
+    │       ├──────────────────────────┐                               │
+    │       │                          │                               │
+    │       ▼                          ▼                               │
+    │  Deterministic checks     LLM assessment                         │
+    │  (CI, conflicts,          (title, description,                   │
+    │   unresolved threads)      quality criteria)                     │
+    │       │                          │                               │
+    │       ▼                          ▼                               │
+    │  ┌─────────────────────────────────────────────┐                 │
+    │  │       Interactive maintainer session        │                 │
+    │  │                                             │                 │
+    │  │  For each PR, display:                      │                 │
+    │  │    • PR title, author, age, labels          │                 │
+    │  │    • CI status with failure details         │                 │
+    │  │    • Merge conflict status                  │                 │
+    │  │    • Unresolved thread summary              │                 │
+    │  │    • LLM assessment (if available)          │                 │
+    │  │    • Main-branch failure patterns           │                 │
+    │  │      (to distinguish systemic CI failures)  │                 │
+    │  │                                             │                 │
+    │  │  Maintainer chooses action:                 │                 │
+    │  │  [d]raft  [c]omment  [x]close  [r]erun CI   │                 │
+    │  │  [b]rebase  [m]ark ready  [s]kip  [q]uit    │                 │
+    │  └─────────────────────────────────────────────┘                 │
+    └──────────────────────────────────────────────────────────────────┘

Review Comment:
   ```suggestion
       ┌─────────────────────────────────────────────────────────────────┐
       │                    Triage Processing Pipeline                   │
       │                                                                 │
       │  Fetch PRs via GraphQL                                          │
       │       │                                                         │
       │       ▼                                                         │
       │  Filter: exclude already-triaged, drafts with known issues      │
       │       │                                                         │
       │       ▼                                                         │
       │  Enrich: fetch CI checks, merge status, review threads          │
       │       │                                                         │
       │       ├──────────────────────────┐                              │
       │       │                          │                              │
       │       ▼                          ▼                              │
       │  Deterministic checks     LLM assessment                        │
       │  (CI, conflicts,          (title, description,                  │
       │   unresolved threads)      quality criteria)                    │
       │       │                          │                              │
       │       ▼                          ▼                              │
       │  ┌─────────────────────────────────────────────┐                │
       │  │       Interactive maintainer session        │                │
       │  │                                             │                │
       │  │  For each PR, display:                      │                │
       │  │    • PR title, author, age, labels          │                │
       │  │    • CI status with failure details         │                │
       │  │    • Merge conflict status                  │                │
       │  │    • Unresolved thread summary              │                │
       │  │    • LLM assessment (if available)          │                │
       │  │    • Main-branch failure patterns           │                │
       │  │      (to distinguish systemic CI failures)  │                │
       │  │                                             │                │
       │  │  Maintainer chooses action:                 │                │
       │  │  [d]raft  [c]omment  [x]close  [r]erun CI   │                │
       │  │  [b]rebase  [m]ark ready  [s]kip  [q]uit    │                │
       │  └─────────────────────────────────────────────┘                │
       └─────────────────────────────────────────────────────────────────┘
   ```
