This is an automated email from the ASF dual-hosted git repository.

MartijnVisser pushed a commit to branch release-2.3
in repository https://gitbox.apache.org/repos/asf/flink.git

commit 8aba8c9c17c66cbb6d8124b0f7282191e433628f
Author: Martijn Visser <[email protected]>
AuthorDate: Fri Apr 17 10:20:48 2026 +0200

    [FLINK-39477][docs] Add AGENTS.md and AI disclosure to PR template
    
    Give AI coding agents a single entry point for Flink conventions and
    boundaries, and require contributors to disclose AI tooling per the
    ASF generative tooling guidance.
    
    Generated-by: Claude Code 2.1.112 (Claude Opus 4.7, 1M context)
    (cherry picked from commit 55d0cc6d89e2043c4bb53ccc35c83085221cc289)
---
 .github/PULL_REQUEST_TEMPLATE.md |  21 +++
 AGENTS.md                        | 338 +++++++++++++++++++++++++++++++++++++++
 2 files changed, 359 insertions(+)

diff --git a/.github/PULL_REQUEST_TEMPLATE.md b/.github/PULL_REQUEST_TEMPLATE.md
index 2427fd4f01f..99d9ff1db8b 100644
--- a/.github/PULL_REQUEST_TEMPLATE.md
+++ b/.github/PULL_REQUEST_TEMPLATE.md
@@ -72,3 +72,24 @@ This change added tests and can be verified as follows:
 
   - Does this pull request introduce a new feature? (yes / no)
   - If yes, how is the feature documented? (not applicable / docs / JavaDocs / 
not documented)
+
+---
+
+##### Was generative AI tooling used to co-author this PR?
+
+<!--
+If generative AI tooling has been used in the process of authoring this PR, 
please
+change the checkbox below to `[X]` followed by the name of the tool, and 
uncomment the
+"Generated-by" line. See the ASF Generative Tooling Guidance for details:
+https://www.apache.org/legal/generative-tooling.html
+
+You are responsible for the quality and correctness of every change in this PR
+regardless of the tooling used. Low-effort AI-generated PRs will be closed. See
+AGENTS.md for the full guidance.
+-->
+
+- [ ] Yes (please specify the tool below)
+
+<!--
+Generated-by: [Tool Name and Version]
+-->
diff --git a/AGENTS.md b/AGENTS.md
new file mode 100644
index 00000000000..daa34497a66
--- /dev/null
+++ b/AGENTS.md
@@ -0,0 +1,338 @@
+<!--
+Licensed to the Apache Software Foundation (ASF) under one
+or more contributor license agreements.  See the NOTICE file
+distributed with this work for additional information
+regarding copyright ownership.  The ASF licenses this file
+to you under the Apache License, Version 2.0 (the
+"License"); you may not use this file except in compliance
+with the License.  You may obtain a copy of the License at
+
+  http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing,
+software distributed under the License is distributed on an
+"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+KIND, either express or implied.  See the License for the
+specific language governing permissions and limitations
+under the License.
+-->
+
+# Flink AI Agent Instructions
+
+This file provides guidance for AI coding agents working with the Apache Flink 
codebase.
+
+## Prerequisites
+
+- Java 11, 17 (default), or 21. Java 11 syntax must be used in all modules. 
Java 17 syntax (records, sealed classes, pattern matching) is only permitted in 
the `flink-tests-java17` module.
+- Maven 3.8.6 (Maven wrapper `./mvnw` included; prefer it)
+- Git
+- Unix-like environment (Linux, macOS, WSL, Cygwin)
+
+## Commands
+
+### Build
+
+- Fast dev build: `./mvnw clean install -DskipTests -Dfast -Pskip-webui-build 
-T1C`
+- Full build (Java 17 default): `./mvnw clean package -DskipTests -Djdk17 
-Pjava17-target`
+- Java 11: `./mvnw clean package -DskipTests -Djdk11 -Pjava11-target`
+- Java 21: `./mvnw clean package -DskipTests -Djdk21 -Pjava21-target`
+- Full build with tests: `./mvnw clean verify`
+- Single module: `./mvnw clean package -DskipTests -pl flink-core-api`
+- Single module with tests: `./mvnw clean verify -pl flink-core-api`
+
+### Testing
+
+- Single test class: `./mvnw -pl flink-core-api -Dtest=MemorySizeTest test`
+- Single test method: `./mvnw -pl flink-core-api 
-Dtest=MemorySizeTest#testParseBytes test`
+
+### Code Quality
+
+- Format code (Java + Scala): `./mvnw spotless:apply`
+- Check formatting: `./mvnw spotless:check`
+- Checkstyle: `./mvnw checkstyle:check -T1C`
+- Checkstyle config: `tools/maven/checkstyle.xml`
+
+## Repository Structure
+
+Every module from the root pom.xml, organized by function. Flink provides 
three main user-facing APIs (recommended in this order: SQL, Table API, 
DataStream API) plus a newer DataStream v2 API.
+
+### Core Infrastructure
+
+- `flink-annotations` — Stability annotations (`@Public`, `@PublicEvolving`, 
`@Internal`, `@Experimental`) and `@VisibleForTesting`
+- `flink-core-api` — Core API interfaces (functions, state, types) shared by 
all APIs
+- `flink-core` — Core implementation (type system, serialization, memory 
management, configuration)
+- `flink-runtime` — Distributed runtime (JobManager, TaskManager, scheduling, 
network, state)
+- `flink-clients` — CLI and client-side job submission
+- `flink-rpc/` — RPC framework
+  - `flink-rpc-core` — RPC interfaces
+  - `flink-rpc-akka`, `flink-rpc-akka-loader` — Pekko-based RPC implementation
+
+### SQL / Table API (recommended API for most users)
+
+- `flink-table/`
+  - `flink-sql-parser` — SQL parser (extends Calcite SQL parser)
+  - `flink-table-common` — Shared types, descriptors, catalog interfaces
+  - `flink-table-api-java` — Table API for Java
+  - `flink-table-api-scala` — Table API for Scala
+  - `flink-table-api-bridge-base`, `flink-table-api-java-bridge`, 
`flink-table-api-scala-bridge` — Bridges between Table and DataStream APIs
+  - `flink-table-api-java-uber` — Uber JAR for Table API
+  - `flink-table-planner` — SQL/Table query planning and optimization 
(Calcite-based)
+  - `flink-table-planner-loader`, `flink-table-planner-loader-bundle` — 
Classloader isolation for planner
+  - `flink-table-runtime` — Runtime operators for Table/SQL queries
+  - `flink-table-calcite-bridge` — Bridge to Apache Calcite
+  - `flink-sql-gateway-api`, `flink-sql-gateway` — SQL Gateway for remote SQL 
execution
+  - `flink-sql-client` — Interactive SQL CLI
+  - `flink-sql-jdbc-driver`, `flink-sql-jdbc-driver-bundle` — JDBC driver for 
SQL Gateway
+  - `flink-table-code-splitter` — Code generation utilities
+  - `flink-table-test-utils` — Test utilities for Table/SQL
+
+### DataStream API (original streaming API)
+
+- `flink-streaming-java` — DataStream API and stream processing operator 
implementations
+
+### DataStream API v2 (newer event-driven API)
+
+- `flink-datastream-api` — DataStream v2 API definitions
+- `flink-datastream` — DataStream v2 API implementation
+
+### Connectors (in-tree)
+
+- `flink-connectors/`
+  - `flink-connector-base` — Base classes for source/sink connectors
+  - `flink-connector-files` — Unified file system source and sink
+  - `flink-connector-datagen` — DataGen source for testing
+  - `flink-connector-datagen-test` — Tests for DataGen connector
+  - `flink-hadoop-compatibility` — Hadoop InputFormat/OutputFormat 
compatibility
+  - `flink-file-sink-common` — Common file sink utilities
+- Most connectors (Kafka, JDBC, Elasticsearch, etc.) live in separate repos 
under [github.com/apache](https://github.com/apache); see README.md for the 
full list
+
+### Formats
+
+- `flink-formats/`
+  - `flink-json`, `flink-csv`, `flink-avro`, `flink-parquet`, `flink-orc`, 
`flink-protobuf` — Serialization formats
+  - `flink-avro-confluent-registry` — Avro with Confluent Schema Registry
+  - `flink-sequence-file`, `flink-compress`, `flink-hadoop-bulk`, 
`flink-orc-nohive` — Hadoop-related formats
+  - `flink-format-common` — Shared format utilities
+  - `flink-sql-json`, `flink-sql-csv`, `flink-sql-avro`, `flink-sql-parquet`, 
`flink-sql-orc`, `flink-sql-protobuf` — SQL-layer format integrations
+  - `flink-sql-avro-confluent-registry` — SQL-layer Avro with Confluent Schema 
Registry
+
+### State Backends
+
+- `flink-state-backends/`
+  - `flink-statebackend-rocksdb` — RocksDB state backend
+  - `flink-statebackend-forst` — ForSt state backend (experimental; a fork of 
RocksDB)
+  - `flink-statebackend-heap-spillable` — Heap-based spillable state backend
+  - `flink-statebackend-changelog` — Changelog state backend
+  - `flink-statebackend-common` — Shared state backend utilities
+- `flink-dstl/flink-dstl-dfs` — State changelog storage (DFS-based persistent 
changelog for incremental checkpointing)
+
+### File Systems
+
+- `flink-filesystems/`
+  - `flink-hadoop-fs` — Hadoop FileSystem abstraction
+  - `flink-s3-fs-hadoop`, `flink-s3-fs-presto`, `flink-s3-fs-base` — S3 file 
systems
+  - `flink-oss-fs-hadoop` — Alibaba OSS
+  - `flink-azure-fs-hadoop` — Azure Blob Storage
+  - `flink-gs-fs-hadoop` — Google Cloud Storage
+  - `flink-fs-hadoop-shaded` — Shaded Hadoop dependencies
+
+### Queryable State
+
+- `flink-queryable-state/`
+  - `flink-queryable-state-runtime` — Server-side queryable state service
+  - `flink-queryable-state-client-java` — Client for querying operator state 
from running jobs
+
+### Deployment
+
+- `flink-kubernetes` — Kubernetes integration
+- `flink-yarn` — YARN integration
+- `flink-dist`, `flink-dist-scala` — Distribution packaging
+- `flink-container` — Container entry-point and utilities for containerized 
deployments
+
+### Metrics
+
+- `flink-metrics/`
+  - `flink-metrics-core` — Metrics API and core implementation
+  - Reporter implementations: `flink-metrics-jmx`, `flink-metrics-prometheus`, 
`flink-metrics-datadog`, `flink-metrics-statsd`, `flink-metrics-graphite`, 
`flink-metrics-influxdb`, `flink-metrics-slf4j`, `flink-metrics-dropwizard`, 
`flink-metrics-otel`
+
+### Libraries
+
+- `flink-libraries/`
+  - `flink-cep` — Complex Event Processing
+  - `flink-state-processing-api` — Offline state access (savepoint 
reading/writing)
+
+### Other
+
+- `flink-models` — AI model integration (sub-modules: `flink-model-openai`, 
`flink-model-triton`)
+- `flink-python` — PyFlink (Python API)
+- `flink-runtime-web` — Web UI for JobManager dashboard
+- `flink-external-resources` — External resource management (e.g., GPU)
+- `docs/` — Documentation content (Hugo site). This is where user-facing docs 
are written.
+- `flink-docs` — Documentation build module (auto-generated config reference 
docs)
+- `flink-examples` — Example programs
+- `flink-quickstart` — Maven archetype for new projects
+- `flink-walkthroughs` — Tutorial walkthrough projects
+
+### Testing
+
+- `flink-tests` — Integration tests
+- `flink-end-to-end-tests` — End-to-end tests
+- `flink-test-utils-parent` — Test utility classes
+- `flink-yarn-tests` — YARN-specific tests
+- `flink-fs-tests` — FileSystem tests
+- `flink-architecture-tests` — ArchUnit architectural boundary tests
+- `tools/ci/flink-ci-tools` — CI tooling
+
+## Architecture Boundaries
+
+1. **Client** submits jobs to the cluster. Submission paths include the CLI 
(`bin/flink run` via `flink-clients`), the SQL Client (`bin/sql-client.sh` via 
`flink-sql-client`), the SQL Gateway (`flink-sql-gateway`, also accessible via 
JDBC driver), the REST API (direct HTTP to JobManager), programmatic execution 
(`StreamExecutionEnvironment.execute()` or `TableEnvironment.executeSql()`), 
and PyFlink (`flink-python`, wraps the Java APIs).
+2. **JobManager** (`flink-runtime`) orchestrates execution: receives jobs, 
creates the execution graph, manages scheduling, coordinates checkpoints, and 
handles failover. Never runs user code directly.
+3. **TaskManager** (`flink-runtime`) executes the user's operators in task 
slots. Manages network buffers, state backends, and I/O.
+4. **Table Planner** (`flink-table-planner`) translates SQL/Table API programs 
into DataStream programs. The planner is loaded in a separate classloader 
(`flink-table-planner-loader`) to isolate Calcite dependencies.
+5. **Connectors** communicate with external systems. Source connectors 
implement the `Source` API (FLIP-27); sinks implement the `Sink` API (package 
`sink2`). Most connectors are externalized to separate repositories.
+6. **State Backends** persist keyed state and operator state. RocksDB is the 
primary backend for production use.
+7. **Checkpointing** provides exactly-once guarantees. The JobManager 
coordinates barriers through the data stream; TaskManagers snapshot local state 
to a distributed file system.
+
+Key separations:
+
+- **Planner vs Runtime:** The table planner generates code and execution 
plans; the runtime executes them. Changes to planning logic live in 
`flink-table-planner`; changes to runtime operators live in 
`flink-table-runtime` or `flink-streaming-java`.
+- **API vs Implementation:** Public API surfaces (`flink-core-api`, 
`flink-datastream-api`, `flink-table-api-java`) are separate from 
implementation modules. API stability annotations control what users can depend 
on.
+- **ArchUnit enforcement:** `flink-architecture-tests/` contains ArchUnit 
tests that enforce module boundaries. New violations should be avoided; if 
unavoidable, follow the freeze procedure in 
`flink-architecture-tests/README.md`.
+
+## Common Change Patterns
+
+This section maps common types of Flink changes to the modules they touch and 
the verification they require.
+
+### Adding a new SQL built-in function
+
+1. Register in `flink-table-common` in `BuiltInFunctionDefinitions.java` 
(definition, input/output type strategies, runtime class reference)
+2. Implement in `flink-table-runtime` under `functions/` (extend the 
appropriate base class: `BuiltInScalarFunction`, `BuiltInTableFunction`, 
`BuiltInAggregateFunction`, or `BuiltInProcessTableFunction`)
+3. Add tests in `flink-table-planner` and `flink-table-runtime`
+4. Extend Table API support
+5. Document in `docs/`
+6. See 
[flink-table/flink-table-planner/AGENTS.md](flink-table/flink-table-planner/AGENTS.md)
 and 
[flink-table/flink-table-runtime/AGENTS.md](flink-table/flink-table-runtime/AGENTS.md)
 for detailed patterns
+
+### Adding a new configuration option
+
+1. Define `ConfigOption<T>` in the relevant config class (e.g., 
`ExecutionConfigOptions.java` in `flink-table-api-java`)
+2. Use `ConfigOptions.key("table.exec....")` builder with type, default value, 
and description
+3. Add `@Documentation.TableOption` annotation for auto-generated docs
+4. Document in `docs/` if user-facing
+5. Verify: unit test for default value, ITCase for behavior change
+
+### Adding a new table operator (e.g., join type, aggregate)
+
+1. Involves `flink-table-runtime` (operator), `flink-table-planner` (ExecNode, 
physical/logical rules), and tests across both
+2. See 
[flink-table/flink-table-planner/AGENTS.md](flink-table/flink-table-planner/AGENTS.md)
 and 
[flink-table/flink-table-runtime/AGENTS.md](flink-table/flink-table-runtime/AGENTS.md)
 for detailed development order and testing patterns
+
+### Adding a new connector (Source or Sink)
+
+1. Implement the `Source` API (`flink-connector-base`): `SplitEnumerator`, 
`SourceReader`, `SourceSplit`, serializers (`SimpleVersionedSerializer`)
+2. Or implement the `Sink` API (package `sink2`) for sinks
+3. Most new connectors go in separate repos under `github.com/apache`, not in 
the main Flink repo
+4. Verify: unit tests + ITCase with real or embedded external system
+
+### Modifying state serializers
+
+1. Changes to `TypeSerializer` require a corresponding 
`TypeSerializerSnapshot` for migration
+2. Bump version in `getCurrentVersion()`, handle old versions in 
`readSnapshot()`
+3. Snapshot must have no-arg constructor for reflection-based deserialization
+4. Implement `resolveSchemaCompatibility()` for upgrade paths
+5. Verify: serializer snapshot migration tests, checkpoint restore tests 
across versions
+
+### Introducing or changing user-facing APIs (`@Public`, `@PublicEvolving`, 
`@Experimental`)
+
+1. New user-facing API requires a voted FLIP (Flink Improvement Proposal); 
this applies to `@Public`, `@PublicEvolving`, and `@Experimental` since users 
build against all three
+2. Every user-facing API class and method must carry a stability annotation
+3. Changes to existing `@Public` or `@PublicEvolving` API must maintain 
backward compatibility
+4. `@Internal` APIs can be changed freely; users should not depend on them
+5. Update JavaDoc on the changed class/method
+6. Add to release notes
+7. Verify: ArchUnit tests pass, no new architecture violations
+
+## Coding Standards
+
+- **Format Java files with Spotless immediately after editing:** `./mvnw 
spotless:apply`. Uses google-java-format with AOSP style.
+- **Scala formatting:** Spotless + scalafmt (config at `.scalafmt.conf`, 
maxColumn 100).
+- **Checkstyle:** `tools/maven/checkstyle.xml` (version defined in root 
`pom.xml` as `checkstyle.version`). Some modules (flink-core, flink-optimizer, 
flink-runtime) are not covered by checkstyle enforcement, but conventions 
should still be followed.
+- **No new Scala code.** All Flink Scala APIs are deprecated per FLIP-265. 
Write all new code in Java.
+- **Apache License 2.0 header** required on all new files (enforced by Apache 
Rat). Use an HTML comment for markdown files.
+- **API stability annotations:** Every user-facing API class and method must 
have a stability annotation. `@Public` (stable across minor releases), 
`@PublicEvolving` (may change in minor releases), `@Experimental` (may change 
at any time). These are all part of the public API surface that users build 
against. `@Internal` marks APIs with no stability guarantees that users should 
not depend on.
+- **Logging:** Use parameterized log statements (SLF4J `{}` placeholders), 
never string concatenation.
+- **No Java serialization** for new features (except internal RPC message 
transport).
+- **Use `final`** for variables and fields where applicable.
+- **Comments:** Do not add unnecessary comments that restate what the code 
does. Add comments that explain "the why" where relevant.
+- **Reuse existing code.** Before implementing new utilities or abstractions, 
search for existing ones in the codebase. Prioritize architecture consistency 
and code reusability.
+- Full code style guide: 
https://flink.apache.org/how-to-contribute/code-style-and-quality-preamble/
+
+## Testing Standards
+
+- Add tests for new behavior, covering success, failure, and edge cases.
+- Use **JUnit 5** + **AssertJ** assertions. Do not use JUnit 4 or Hamcrest in 
new test code.
+- Prefer real test implementations over Mockito mocks where possible.
+- **Integration tests:** Name classes with `ITCase` suffix (e.g., 
`MyFeatureITCase.java`).
+- **Red-green verification:** For bug fixes, verify that new tests actually 
fail without the fix before confirming they pass with it.
+- **Test location** mirrors source structure within each module.
+- Follow the testing conventions at 
https://flink.apache.org/how-to-contribute/code-style-and-quality-common/#7-testing
+
+## Commits and PRs
+
+### Commit message format
+
+- `[FLINK-XXXX][component] Description` where FLINK-XXXX is the JIRA issue 
number
+- `[hotfix][component] Description` for typo fixes without JIRA
+- Each commit must have a meaningful message including the JIRA ID. If you 
don't know the ticket number, ask.
+- Separate cleanup/refactoring from functional changes into distinct commits
+- When AI tools were used: add `Generated-by: <Tool Name and Version>` trailer 
per [ASF generative tooling 
guidance](https://www.apache.org/legal/generative-tooling.html)
+
+### Pull request conventions
+
+- Title format: `[FLINK-XXXX][component] Title of the pull request`
+- A corresponding JIRA issue is required (except hotfixes for typos)
+- Fill out the PR template completely but concisely: describe purpose, change 
log, testing approach, impact assessment
+- Each PR should address exactly one issue
+- Ensure `./mvnw clean verify` passes before opening a PR
+- Always push to your fork, not directly to `apache/flink`
+- Rebase onto the latest target branch before submitting
+
+### AI-assisted contributions
+
+- Disclose AI usage by checking the AI disclosure checkbox and uncommenting 
the `Generated-by` line in the PR template
+- Add `Generated-by: <Tool Name and Version>` to commit messages
+- Never add `Co-Authored-By` with an AI agent as co-author; agents are 
assistants, not authors
+- You must be able to explain the design, code, and tests, debug them, and 
respond to review feedback substantively
+- Reviewer-ready quality bar: the author owns PR quality. PRs that look 
AI-generated without author refinement (walls of unreviewed prose, scaffolding 
without behaviour, tests that do not exercise the change, padded commit 
messages) will be closed without review
+
+## Boundaries
+
+### Ask first
+
+- Adding or changing `@Public`, `@PublicEvolving`, or `@Experimental` 
annotations (these are user-facing API commitments requiring a FLIP)
+- Large cross-module refactors
+- New dependencies
+- Changes to serialization formats (affects state compatibility)
+- Changes to checkpoint/savepoint behavior
+- Changes that could impact performance on hot paths (per-record processing, 
serialization, state access)
+
+### Never
+
+- Commit secrets, credentials, or tokens
+- Push directly to `apache/flink`; always work from your fork
+- Mix unrelated changes into one PR
+- Use Java serialization for new features
+- Edit generated files by hand when a generation workflow exists
+- Use the legacy `SourceFunction` or `SinkFunction` interfaces for connectors; 
use the `Source` API (FLIP-27) and `Sink` API (package `sink2`) instead
+- Add `Co-Authored-By` with an AI agent as co-author in commit messages; AI 
agents are assistants, not authors. Use `Generated-by: <Tool Name and Version>` 
instead.
+- Suppress or bypass checkstyle rules (no `CHECKSTYLE:ON`/`CHECKSTYLE:OFF` 
comments, no adding entries to `tools/maven/suppressions.xml`, no 
`@SuppressWarnings`). Fix the code to satisfy checkstyle instead.
+- Add, change, or remove classes outside the `org.apache.flink.*` package (for 
example, classes copied from Calcite)
+- Modify `Parser.jj` (Calcite's generated parser grammar; expected to be 
removed in future Calcite upgrades)
+- Use destructive git operations unless explicitly requested
+
+## References
+
+- [README.md](README.md) — Build instructions and project overview
+- [DEVELOPMENT.md](DEVELOPMENT.md) — IDE setup and development environment
+- [.github/CONTRIBUTING.md](.github/CONTRIBUTING.md) — Contribution process
+- [.github/PULL_REQUEST_TEMPLATE.md](.github/PULL_REQUEST_TEMPLATE.md) — PR 
checklist
+- [Code Style 
Guide](https://flink.apache.org/how-to-contribute/code-style-and-quality-preamble/)
 — Detailed coding guidelines
+- [ASF Generative Tooling 
Guidance](https://www.apache.org/legal/generative-tooling.html) — AI tooling 
policy

Reply via email to