This is an automated email from the ASF dual-hosted git repository.
bobbai00 pushed a commit to branch release/v1.1.0-incubating
in repository https://gitbox.apache.org/repos/asf/texera.git
The following commit(s) were added to refs/heads/release/v1.1.0-incubating by
this push:
new 86d6ea0644 docs(1.1): backport AGENTS.md (#4549, #4825) (#4939)
86d6ea0644 is described below
commit 86d6ea06440e1ccb2894ae390b02b999c04bcdee
Author: Jiadong Bai <[email protected]>
AuthorDate: Mon May 4 21:18:30 2026 -0700
docs(1.1): backport AGENTS.md (#4549, #4825) (#4939)
### What changes were proposed in this PR?
Backport #4549 (initial AGENTS.md + CLAUDE.md) and #4825 (AGENTS.md
rewrite) to `release/v1.1.0-incubating`. Two cherry-picks: #4549 applies
cleanly; #4825 has trivial content conflicts, resolved by taking the
post-rewrite state (i.e. `upstream/main:AGENTS.md` verbatim, which is what
the squashed result of both PRs would produce on main).
### Any related issues, documentation, discussions?
Backport of #4549 and #4825. Unblocks the auto-backport simulation for
#4938 (Java 17 bump) and any future PR that touches `AGENTS.md`.
### How was this PR tested?
Doc-only change. Verified `diff <(git show HEAD:AGENTS.md) <(git show
upstream/main:AGENTS.md)` is empty.
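A minimal sketch of the same check as a reusable function, assuming it runs inside a git checkout where both revisions resolve. The name `same_at_revs` is hypothetical and not part of the PR; it relies on `git diff --quiet`, which exits 0 when the path has no differences between the two revisions.

```shell
# Hypothetical helper (not from the PR): succeeds when a path has
# identical content at two revisions.
same_at_revs() {
  # $1, $2: revisions to compare; $3: path inside the repository
  git diff --quiet "$1" "$2" -- "$3"
}

# Usage mirroring the check described above; `upstream` must be a
# configured remote that has already been fetched.
# same_at_revs HEAD upstream/main AGENTS.md && echo "AGENTS.md matches"
```

Note that a path missing from both revisions also compares as identical, so the sketch only makes sense for files known to exist.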
### Was this PR authored or co-authored using generative AI tooling?
Generated-by: Claude Code (Opus 4.7)
---------
Co-authored-by: Xinyuan Lin <[email protected]>
Co-authored-by: Yicong Huang <[email protected]>
Co-authored-by: Claude Opus 4.7 (1M context) <[email protected]>
---
AGENTS.md | 185 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
CLAUDE.md | 3 +
2 files changed, 188 insertions(+)
diff --git a/AGENTS.md b/AGENTS.md
new file mode 100644
index 0000000000..cb1cd0fdb3
--- /dev/null
+++ b/AGENTS.md
@@ -0,0 +1,185 @@
+# AGENTS.md
+
+## Architecture Map
+
+Apache Texera: Scala/sbt backend services + the Amber workflow execution
+engine, an Angular UI, and the agent service. JVM modules wired in
+[`build.sbt`](build.sbt).
+
+| Area | Path | Detail |
+| --- | --- | --- |
+| Workflow execution engine (Amber) | `amber/` | [amber/README.md](amber/README.md) |
+| Backend services | `config-service/`, `access-control-service/`, `file-service/`, `computing-unit-managing-service/`, `workflow-compiling-service/` | `build.sbt` |
+| Shared Scala libs | `common/` (`auth`, `config`, `dao`, `workflow-core`, `workflow-operator`, `pybuilder`) | `build.sbt` |
+| Frontend (Angular) | `frontend/` | [frontend/README.md](frontend/README.md) |
+| Agent service (Bun/TS, LLM agents) | `agent-service/` | `agent-service/package.json` |
+| Pyright language service | `pyright-language-service/` | [pyright-language-service/README.md](pyright-language-service/README.md) |
+| Deploy scripts / Dockerfiles | `bin/` | [README](bin/README.md) / [k8s](bin/k8s/README.md) / [single-node](bin/single-node/README.md) |
+| DDL, sbt plugins | `sql/`, `project/` | files therein |
+
+### Amber breakdown
+
+| Path | Role |
+| --- | --- |
+| `amber/src/main/scala` | Pekko actors, scheduler, reconfiguration, fault tolerance, gRPC/proto |
+| `amber/src/main/python/pyamber` | Python engine (`pyamber`) — bridge to the Scala engine |
+| `amber/src/main/python/pytexera` | Python operator SDK exposed to UDFs |
+
+## Where Things Live
+
+| Topic | Source of truth |
+| --- | --- |
+| Contribution / PR / lint / format / testing / license header | [CONTRIBUTING.md](CONTRIBUTING.md) |
+| Reporting security issues | [SECURITY.md](SECURITY.md) |
+| PR template | [.github/PULL_REQUEST_TEMPLATE](.github/PULL_REQUEST_TEMPLATE) |
+| Issue templates | [bug](.github/ISSUE_TEMPLATE/bug-template.yaml) / [task](.github/ISSUE_TEMPLATE/task-template.yaml) / [feature](.github/ISSUE_TEMPLATE/feature-template.yaml) |
+| License-header coverage; vendored `workflow-operator` | [.licenserc.yaml](.licenserc.yaml); [project/AddMetaInfLicenseFiles.scala](project/AddMetaInfLicenseFiles.scala) |
+| Local single-node / k8s deploy | [single-node](bin/single-node/README.md), [k8s](bin/k8s/README.md) |
+
+If a topic is above, **read that file** instead of asking here.
+
+## Agent-Specific Rules
+
+### Scope and safety
+
+- Narrowly scoped changes. No unrelated rewrites or cross-service moves.
+- `git status --short` before editing; don't revert unrelated dirty files.
+- Never commit secrets / local config / build output / caches / binaries
+ (`python_udf.conf`, `.env`, `target/`, `dist/`, `.pytest_cache/`,
+ `.ruff_cache/`, logs).
+
+### Develop in a worktree
+
+Leave `texera/` on `main`. One worktree per PR, branched off a freshly
+fetched `upstream/main`.
+
+```
+texera/ # stays on main, never dirty
+texera-worktrees/<branch>/ # one worktree per PR
+```
+
+Reset to `upstream/main` at start; `git log upstream/main..HEAD` should
+contain only this PR's commits before pushing; remove the worktree after
+merge.
+
+### Environment
+
+| Component | Version |
+| --- | --- |
+| Java | JDK 11 |
+| Scala | 2.13 |
+| Python | 3.12 |
+| Node | 24 |
+
+One Python venv shared across worktrees, sibling of the texera checkout:
+
+```
+<workspace>/
+├── texera/ # main checkout
+├── texera-worktrees/<br>/ # per-PR worktrees
+└── venv312/ # shared Python 3.12 venv
+```
+
+```bash
+python3.12 -m venv ../venv312 && source ../venv312/bin/activate
+pip install -r amber/requirements.txt -r amber/operator-requirements.txt
+```
+
+Tests that spawn Python workers need an interpreter path. Edit `python.path`
+in [`udf.conf`](common/config/src/main/resources/udf.conf) or
+`export UDF_PYTHON_PATH="$(pwd)/../venv312/bin/python"` (env var overrides).
+Without it, `sbt` Python-integration tests fail to launch a worker.
+
+### Branch and commit naming
+
+Short, **Conventional Commits**, same shape for branch and commit subject.
+
+| Kind | Branch | Commit |
+| --- | --- | --- |
+| Feature | `feat/agent-workflow-edit` | `feat(agent-service): enable workflow edit` |
+| Bug fix | `fix/marker-replay` | `fix(amber): marker replay during reconfiguration` |
+| Tests | `test/pyamber-handlers` | `test(pyamber): add handler unit tests` |
+| Chore | `chore/angular-21` | `chore(deps): upgrade frontend to Angular 21` |
+| CI | `ci/cache-action-bump` | `ci: bump coursier/cache-action to v8.1.0` |
+
+Both ≤ ~60 chars. For code changes, if you use a scope, use the module name
+(`amber`, `pyamber`, `frontend`, `agent-service`, `file-service`, …) — not
+`amber-python`. Use `chore(deps): ...` for dependency-only updates, and
+`ci: ...` for CI-only changes. No `Co-authored-by:` trailer for the repo
+owner.
+
+### Issues and PRs
+
+Issue-first; both stay short.
+
+```
+issue (template + Type) -> PR (Closes #N, template) -> review -> merge
+```
+
+- Every change starts as an issue (minor typo / docs excepted). File against
+ `apache/texera`, never a fork.
+- Pick the right template **and** set the GitHub Issue **Type** explicitly
+ (`Bug` / `Task` / `Feature`); the template's `type:` frontmatter doesn't
+ always apply on creation.
+- Reference the issue: `Closes #N` (or `Fixes` / `Resolves`, or "related to").
+- Issue titles are **plain prose**; never use the Conventional Commits
+ format (`type(scope): ...`) — that prefix is for commit and PR titles only.
+- Task issues match `task-template.yaml` exactly.
+- Prefer **tables** and small **ASCII diagrams** over long bullets. Don't
+ restate the diff or the template.
+- For bugs, lead with **root cause** and a **before -> after** sketch:
+ ```
+ Before: reconfiguration -> replay marker -> worker hangs
+ After: reconfiguration -> replay marker -> resume from checkpoint
+ ```
+- **Frontend PRs**: any visible UI change requires screenshots / GIF,
+ **before / after** side by side. For purely visual fixes that's the
+ primary verification under "How was this PR tested?"; interactive flows
+ also list manual steps (click path, browser, viewport).
+
+### Tests come first
+
+TDD. Write the test before the source change.
+
+```
+write/adjust test (red) -> edit source (green) -> refactor
+```
+
+| Situation | Order |
+| --- | --- |
+| New feature / behavior change | Failing test, then implement. |
+| Bug fix | Regression test reproducing the bug, then fix. |
+| Code with **no tests** | **Characterization tests** pin current behavior first; only then change source. |
+| Refactor (no behavior change) | Tests stay green throughout — no assertion edits. |
+
+Every test must cover:
+
+- **Both directions**: positive (valid → expected) **and** negative (invalid
+ / error → specific failure mode).
+- **Edge cases**: empty / null / zero / max / boundary, unicode,
+ concurrency/order, missing or malformed config.
+- **Don't assume valid.** External input (user / API / file / message) must
+ be tested with bad input.
+
+Don't claim "tested" without commands. Paste the exact `sbt testOnly` /
+`pytest` / `yarn test:ci` / `bun test` invocation under "How was this PR
+tested?".
+
+### CI labels & gating
+
+CI runs are **selected by PR labels**, not by file diff.
+
+```
+diff -> pr-labeler -> labels on PR -> required-checks maps labels to stacks -> CI runs
+```
+
+- Path → label rules: [`.github/labeler.yml`](.github/labeler.yml)
+- Label → stacks (`LABEL_STACKS`, source of truth):
+  [`.github/workflows/required-checks.yml`](.github/workflows/required-checks.yml).
+ Read it directly; don't duplicate the mapping here.
+- Need extra coverage the diff doesn't imply (e.g. a `common/` change you
+ suspect breaks the frontend)? **Add the relevant label manually**.
+- Empty stack union (docs-only / dev-only / `dependencies` / `feature` /
+ `fix` / `refactor` / `release/*` only) skips every build stack on purpose.
+- `release/*` labels select backport targets; removing one cancels that
+ backport.
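
The label-to-stack union described in the CI section above can be sketched as follows. This is an illustration of the union logic only: the label names (`label-a`, `label-b`) and stack names (`stack-x`, `stack-y`) are placeholders, and the real mapping is whatever `LABEL_STACKS` in `.github/workflows/required-checks.yml` defines. Treating unknown labels as mapping to no stack is also an assumption.

```shell
# Illustrative only: placeholder labels and stacks, not the project's
# actual LABEL_STACKS values.
stacks_for_label() {
  case "$1" in
    label-a) echo "stack-x stack-y" ;;
    label-b) echo "stack-y" ;;
    *) : ;;  # assumed: any other label contributes no stack
  esac
}

# Print the de-duplicated union of stacks for all labels passed as
# arguments. Empty output corresponds to skipping every build stack.
stack_union() {
  for label in "$@"; do
    for stack in $(stacks_for_label "$label"); do
      echo "$stack"
    done
  done | sort -u | tr '\n' ' ' | sed 's/ $//'
}
```

Under these placeholder values, `stack_union label-a label-b` prints `stack-x stack-y`, while a PR carrying only labels with no mapped stack yields an empty union.
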
diff --git a/CLAUDE.md b/CLAUDE.md
new file mode 100644
index 0000000000..5c64d0f0e3
--- /dev/null
+++ b/CLAUDE.md
@@ -0,0 +1,3 @@
+# CLAUDE.md
+
+Use the project guidance in [AGENTS.md](AGENTS.md).
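
The commit naming rules in the AGENTS.md diff above (Conventional Commits shape, roughly 60 characters at most) lend themselves to a mechanical check. The sketch below is hypothetical and not part of the commit: the function name `valid_subject`, the hard 60-character limit, and the accepted type list (the types shown in the naming table plus `docs`, which this commit's own subject uses) are all assumptions.

```shell
# Hypothetical check, not part of the repository. Accepted types and the
# hard 60-character cutoff are assumptions drawn from the examples above.
valid_subject() {
  # Reject subjects longer than 60 characters.
  [ "${#1}" -le 60 ] || return 1
  # Require: type, optional (scope), colon, space, non-empty description.
  printf '%s\n' "$1" |
    grep -Eq '^(feat|fix|test|chore|ci|docs)(\([a-z0-9.-]+\))?: .+'
}

# Usage:
# valid_subject 'fix(amber): marker replay during reconfiguration' && echo ok
```

The scope is optional in the pattern because the naming table includes scope-less subjects such as `ci: bump coursier/cache-action to v8.1.0`.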