This is an automated email from the ASF dual-hosted git repository.
janhoy pushed a commit to branch main
in repository https://gitbox.apache.org/repos/asf/solr-orbit-workloads.git
The following commit(s) were added to refs/heads/main by this push:
new 4fc9bbc SOLR-18258 Rebrand the product to Solr Orbit (#1)
4fc9bbc is described below
commit 4fc9bbc3af155185477e0bd809027ea48167162f
Author: Jan Høydahl <[email protected]>
AuthorDate: Fri May 22 02:17:19 2026 +0200
SOLR-18258 Rebrand the product to Solr Orbit (#1)
---
AGENTS.md | 158 ++++++++++++++++++++++++++++++
CONTRIBUTING.md | 36 +++----
INCUBATION_TODO.md | 4 +-
README.md | 16 +--
geonames/configsets/geonames/schema.xml | 2 +-
nyc_taxis/README.md | 10 +-
nyc_taxis/TEST_PROCEDURES.md | 2 +-
nyc_taxis/configsets/nyc_taxis/schema.xml | 2 +-
8 files changed, 194 insertions(+), 36 deletions(-)
diff --git a/AGENTS.md b/AGENTS.md
new file mode 100644
index 0000000..4b9d18f
--- /dev/null
+++ b/AGENTS.md
@@ -0,0 +1,158 @@
+# AGENTS.md — Apache Solr Orbit Workloads
+
+This file provides guidance for AI coding agents (Claude Code, Codex, Cursor, etc.) working in
+this repository. Read it before making any changes.
+
+## Project overview
+
+This repository contains the default workload specifications for
+[Apache Solr Orbit](https://github.com/apache/solr-orbit), the macrobenchmarking framework for
+[Apache Solr](https://solr.apache.org/). It is an **Apache Software Foundation (ASF) project**
+governed by the Solr PMC.
+
+Each workload is a self-contained directory that defines:
+- the Solr collection schema (`configsets/`)
+- the data corpus references (`workload.json`, `files.txt`)
+- the named operations (`operations/default.json`)
+- the test procedures (`test_procedures/default.json`)
+
+## Apache Software Foundation rules
+
+This is an ASF project. The following rules are **non-negotiable**:
+
+- Every source file must carry the standard Apache License 2.0 header. Do not remove or alter
+ existing license headers.
+- Do not add dependencies, tools, or services that are not ASF-compatible (Category A or B
+ licenses only; GPL, AGPL, SSPL, and similar are forbidden).
+- All commits must represent the contributor's own work or work they have the right to submit
+ under Apache License 2.0 (DCO / ICLA obligations).
+- Do not commit secrets, credentials, or personally identifiable information (PII) of any kind.
+- The canonical project mailing list is **[email protected]**. Significant decisions belong
+ there, not in code comments or commit messages.
+- Bug reports and feature requests live in
+ [GitHub Issues](https://github.com/apache/solr-orbit-workloads/issues).
+
+## Dataset attribution and licensing
+
+Each dataset bundled in a workload carries its own licence. Agents **must not** strip, alter,
+or omit these attributions.
+
+### Adding a new workload dataset
+
+Before introducing a new dataset, verify:
+1. It contains no PII and no proprietary data.
+2. You hold, or have obtained in writing, the rights to redistribute it.
+3. Its licence is documented verbatim in the workload's `README.md`.
+4. The licence is ASF-compatible (CC-BY, public-domain, ODbL, and similar are fine; CC-NC or
+ CC-ND variants are **not** acceptable).
+
+## Repository structure
+
+```
+common_operations/          # Shared operation snippets (collection lifecycle)
+  create_collection.json
+  delete_collection.json
+  check_cluster_health.json
+  optimize.json
+
+<workload>/                 # One directory per workload (e.g. geonames/, nyc_taxis/)
+  workload.json             # Collections, corpora, operations refs, test procedure refs
+  workload.py               # (optional) Dynamic workload logic
+  files.txt                 # List of corpus data files
+  README.md                 # Required — see "Contributing a workload" section
+  configsets/<name>/        # Solr configset: schema.xml + solrconfig.xml
+  operations/
+    default.json            # Named operations referenced by test procedures
+  test_procedures/
+    default.json            # At least one procedure must be marked "default": true
+  common_operations/        # (optional) Workload-local operation overrides
+```
+
+Reuse the root-level `common_operations/` snippets for collection lifecycle steps. Do not
+duplicate `create_collection`, `delete_collection`, `optimize`, or `check_cluster_health`
+inside individual workloads.
+
+## Branching model
+
+| Branch | Purpose |
+|--------|---------|
+| `main` | Default; target for changes that apply to all Solr versions |
+| `10`, `9`, … | Solr major-version branches; cherry-pick from `main` as needed |
+
+`solr-orbit` automatically selects the branch matching the Solr version under test, falling
+back to `main`. Use `--workload-revision` to pin an explicit branch.
+
+When backporting: `main → 10 → 9`. If a cherry-pick conflicts, open a separate PR targeting
+the version branch directly.
+
+## How to test changes
+
+There is no local test runner in this repository. Validation is done through the `solr-orbit`
+CLI (from [apache/solr-orbit](https://github.com/apache/solr-orbit)).
+
+### Quick sanity check (local Solr required)
+
+```bash
+# Test a modified workload using a local path
+solr-orbit run \
+ --pipeline=benchmark-only \
+ --target-host=localhost:8983 \
+ --workload-path=/path/to/your/fork/<workload> \
+ --test-mode
+
+# Or use Docker to provision Solr automatically
+solr-orbit run \
+ --pipeline=docker \
+ --distribution-version=9.10.1 \
+ --workload=nyc_taxis \
+ --test-mode
+```
+
+`--test-mode` reduces corpus size and iteration counts for fast feedback. Always run it before
+claiming a change works.
+
+### Full benchmark (required before opening a PR for a new workload)
+
+```bash
+solr-orbit run \
+ --pipeline=benchmark-only \
+ --target-host=localhost:8983 \
+ --workload-path=/path/to/your/workload
+```
+
+Include the result summary in the PR description.
+
+## Code and file conventions
+
+- **workload.json**: uses Jinja2 templating. Prefer `zstd` compression for corpora; include a
+ `bz2` fallback via the conditional already established in existing workloads.
+- **schema.xml / solrconfig.xml**: must carry the full ASF license header (see existing files
+ as the template).
+- **JSON files**: 2-space indentation, no trailing commas (standard JSON).
+- Do not add comments to JSON files — JSON does not support comments and `solr-orbit` will
+ reject malformed input.
+- **README.md for a workload** must include: workload purpose, example document, parameters
+ table, test procedures table, sample output, and dataset licence. See
+ `nyc_taxis/README.md` as the reference implementation.
+
+## Pull request checklist
+
+Before opening a PR, confirm:
+
+- [ ] `--test-mode` run completes cleanly against at least one supported Solr version.
+- [ ] For new workloads: a full (non-test-mode) benchmark was run; output is in the PR description.
+- [ ] Dataset licence is documented in the workload README and is ASF-compatible.
+- [ ] No PII, credentials, or non-ASF-licensed code is introduced.
+- [ ] ASF license header is present on every new source file.
+- [ ] The PR description states which branches need backporting.
+
+## What agents should avoid
+
+- Do not edit `files.txt` unless you have actually verified the corpus file list.
+- Do not change `base-url` values in `workload.json` — corpus hosting is managed by the
+ maintainers and is pending migration to ASF infrastructure (see `INCUBATION_TODO.md`).
+- Do not introduce Python dependencies in `workload.py` beyond the standard library and what
+ `solr-orbit` already provides.
+- Do not modify the branch protection or merge strategy configured in `.asf.yml` without
+ explicit maintainer approval.
+- Do not open issues, PRs, or send emails on behalf of the user without explicit instruction.
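
Reviewer note (not part of the commit): the directory layout that the new AGENTS.md describes could be checked before a `--test-mode` run with a small script. The following is only a sketch. The function name `missing_paths` is hypothetical, and it assumes that every entry in the AGENTS.md tree that is not marked "(optional)" is required; the commit itself does not ship such a check.

```python
from pathlib import Path

# Assumption: entries not marked "(optional)" in the AGENTS.md tree are required.
REQUIRED_FILES = (
    "workload.json",
    "files.txt",
    "README.md",
    "operations/default.json",
    "test_procedures/default.json",
)
# AGENTS.md describes a configset as schema.xml + solrconfig.xml.
CONFIGSET_FILES = ("schema.xml", "solrconfig.xml")


def missing_paths(workload_dir):
    """Return relative paths the documented layout expects but that are absent."""
    root = Path(workload_dir)
    missing = [rel for rel in REQUIRED_FILES if not (root / rel).is_file()]

    configsets = root / "configsets"
    names = sorted(p for p in configsets.iterdir() if p.is_dir()) if configsets.is_dir() else []
    if not names:
        # No configset directory at all: report the placeholder from the tree.
        missing.append("configsets/<name>/")
    for configset in names:
        for fname in CONFIGSET_FILES:
            if not (configset / fname).is_file():
                missing.append(f"configsets/{configset.name}/{fname}")
    return missing
```

Calling `missing_paths("nyc_taxis")` from the repository root would return an empty list if the workload matches the documented layout. The check only looks at file presence; it does not parse the JSON files, since `workload.json` is stated to use Jinja2 templating.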
diff --git a/CONTRIBUTING.md b/CONTRIBUTING.md
index 8ac81c1..136000f 100644
--- a/CONTRIBUTING.md
+++ b/CONTRIBUTING.md
@@ -1,7 +1,7 @@
# Contributor Guidelines
This repository contains the default workload specifications for
-[Apache Solr Benchmark](https://github.com/janhoy/solr-benchmark).
+[Apache Solr Orbit](https://github.com/apache/solr-orbit).
This document is a guide on best practices for contributing to this repository.
## Contents
@@ -28,7 +28,7 @@ This document is a guide on best practices for contributing to this repository.
This repository uses major version branches named after the Solr major version number (e.g.
`9`, `10`). The `main` branch is the default.
-When running a benchmark, `solr-benchmark` automatically selects the workload branch that
+When running a benchmark, `solr-orbit` automatically selects the workload branch that
matches the Solr version being tested. For example, benchmarking a Solr 10.X.X cluster will
use the `10` branch if it exists, falling back to `main` otherwise. To cherry-pick your
workload changes to the right branch, base that on the major version of the cluster you intend
@@ -63,16 +63,16 @@ Before making a change, fork this repository and make the change on a feature br
## Test changes
After making changes in your feature branch, test them locally and optionally via GitHub Actions
-integration tests in your forked `solr-benchmark` repository.
+integration tests in your forked `solr-orbit` repository.
### Testing changes locally
1. Start a local Solr cluster to test against (standalone or SolrCloud).
-2. Run `solr-benchmark` pointing at your modified workload using `--workload-path` or
+2. Run `solr-orbit` pointing at your modified workload using `--workload-path` or
`--workloads-repository`. Use `--test-mode` for a quick sanity-check run:
```bash
-solr-benchmark run \
+solr-orbit run \
--pipeline=benchmark-only \
--target-host=localhost:8983 \
--workload-path=/path/to/your/fork/nyc_taxis \
@@ -84,24 +84,24 @@ solr-benchmark run \
Additional tips:
- `--test-mode` reduces the corpus size and iteration counts so the run finishes quickly.
- To enforce a specific workloads branch from a remote repository, pass
- `--workloads-repository=https://github.com/<YOUR USERNAME>/solr-benchmark-workloads` and
- `--distribution-version=X.Y.Z` to pin `solr-benchmark` to the matching branch.
+ `--workloads-repository=https://github.com/<YOUR USERNAME>/solr-orbit-workloads` and
+ `--distribution-version=X.Y.Z` to pin `solr-orbit` to the matching branch.
### Testing changes with integration tests
To catch regressions across the full suite, run integration tests from your forked
-`solr-benchmark` repository.
+`solr-orbit` repository.
**One-time setup:**
-1. Fork [solr-benchmark](https://github.com/janhoy/solr-benchmark).
+1. Fork [solr-orbit](https://github.com/apache/solr-orbit).
2. In your fork, create a branch called `test-forked-workloads` based off `main`.
3. In that branch, update the integration test configuration to point at your forked workloads
repository:
```ini
[workloads]
-default.url = https://github.com/<YOUR GITHUB USERNAME>/solr-benchmark-workloads
+default.url = https://github.com/<YOUR GITHUB USERNAME>/solr-orbit-workloads
```
4. Push that branch to your fork.
@@ -111,7 +111,7 @@ default.url = https://github.com/<YOUR GITHUB USERNAME>/solr-benchmark-workloads
1. Cherry-pick your workload change(s) onto the relevant branches of your forked workloads
repository.
2. Push those branches.
-3. In your forked `solr-benchmark` repository, go to **GitHub Actions → Run Integration Tests**,
+3. In your forked `solr-orbit` repository, go to **GitHub Actions → Run Integration Tests**,
select the `test-forked-workloads` branch, and click **Run workflow**.
4. Verify that all tests pass.
@@ -130,7 +130,7 @@ Before opening a pull request, make sure you have addressed the following:
behaviour, tag a subject-matter expert.
Create a pull request from your fork to the
-[`main` branch of this repository](https://github.com/janhoy/solr-benchmark-workloads).
+[`main` branch of this repository](https://github.com/apache/solr-orbit-workloads).
## Reviewing pull-requests
@@ -160,7 +160,7 @@ included in the backport PR.
## Contributing a workload
-See the [Apache Solr Benchmark documentation site](https://janhoy.github.io/solr-benchmark/)
+See the [Apache Solr Orbit documentation site](https://apache.github.io/solr-orbit/)
for the full workload specification reference, including operation types, Jinja2 templating,
and test procedure format.
@@ -178,7 +178,7 @@ A new workload must provide:
- `workload.json` — defining `collections`, `corpora`, `operations`, and `test_procedures`
- `configsets/<name>/` — a valid Solr configset (`schema.xml` + `solrconfig.xml`). If no
- configset is provided, Apache Solr Benchmark will attempt to auto-generate a basic schema
+ configset is provided, Apache Solr Orbit will attempt to auto-generate a basic schema
from the document structure, but an explicit configset is strongly recommended for
benchmarking accuracy.
- `operations/default.json` — the named operations referenced by test procedures
@@ -200,10 +200,10 @@ Provide a detailed `README.md` that includes:
- The workload parameters that can be used to customize the workload.
- A list of default and available test procedures.
- A sample of the console output produced after a successful test run.
-- The open-source licence that gives users and Apache Solr Benchmark permission to use the
+- The open-source licence that gives users and Apache Solr Orbit permission to use the
dataset.
-For an example, see the [`nyc_taxis` README](https://github.com/janhoy/solr-benchmark-workloads/blob/main/nyc_taxis/README.md).
+For an example, see the [`nyc_taxis` README](https://github.com/apache/solr-orbit-workloads/blob/main/nyc_taxis/README.md).
### Testing a new workload
@@ -213,7 +213,7 @@ All test runs used to produce example output must target a live Apache Solr clus
end-to-end pass:
```bash
- solr-benchmark run \
+ solr-orbit run \
--pipeline=benchmark-only \
--target-host=localhost:8983 \
--workload-path=/path/to/your/workload \
@@ -233,4 +233,4 @@ that other users can download them.
For questions, reach out on the [[email protected]](https://lists.apache.org/[email protected]) mailing list or
-open a [GitHub issue](https://github.com/janhoy/solr-benchmark-workloads/issues).
+open a [GitHub issue](https://github.com/apache/solr-orbit-workloads/issues).
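
Reviewer note (not part of the commit): both AGENTS.md and CONTRIBUTING.md say that `solr-orbit` picks the workload branch named after the Solr major version and falls back to `main`. The rule as documented can be sketched as below. This is an illustration of the documented behaviour only; the function name `select_workload_branch` is hypothetical and the actual `solr-orbit` implementation is not shown in this commit.

```python
def select_workload_branch(solr_version, available_branches):
    """Pick the workload branch for a Solr version, per the documented rule.

    Assumption: version branches are named after the Solr major version
    (for example "9" or "10"), and "main" is the fallback.
    """
    major = solr_version.split(".", 1)[0]
    return major if major in available_branches else "main"


branches = {"main", "10", "9"}
print(select_workload_branch("10.0.1", branches))  # prints 10
print(select_workload_branch("8.11.2", branches))  # prints main
```

An explicit `--workload-revision`, as mentioned in AGENTS.md, would bypass this selection entirely.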
diff --git a/INCUBATION_TODO.md b/INCUBATION_TODO.md
index 2a23ad4..110f696 100644
--- a/INCUBATION_TODO.md
+++ b/INCUBATION_TODO.md
@@ -3,10 +3,10 @@
Tasks to complete as part of donating this repository to the Apache Software Foundation
under the Solr PMC.
-- [ ] Rename the project
+- [x] Rename the project
- [ ] Complete the [ASF IP Clearance](https://incubator.apache.org/ip-clearance/) process
(conducted by the Solr PMC)
-- [ ] Move repository to the `apache/` GitHub organisation and update all internal links
+- [x] Move repository to the `apache/` GitHub organisation and update all internal links
from `janhoy/` to `apache/` and proper link to docs site
- [ ] Migrate corpus data files to ASF-managed or community-controlled storage and update
`base-url` in both workload definitions (see FIXME in `README.md`)
diff --git a/README.md b/README.md
index 7b7a7cd..fba6634 100644
--- a/README.md
+++ b/README.md
@@ -1,11 +1,11 @@
-Apache Solr Benchmark Workloads
---------------------------------
+Apache Solr Orbit Workloads
+---------------------------
[](LICENSE)
[](CONTRIBUTING.md)
This repository contains the default workload specifications for
-[Apache Solr Benchmark](https://github.com/janhoy/solr-benchmark),
+[Apache Solr Orbit](https://github.com/apache/solr-orbit),
the macrobenchmarking framework for [Apache Solr](https://solr.apache.org/).
You do not need to interact with this repository directly unless you want to inspect existing
@@ -15,20 +15,20 @@ workloads, run benchmarks with a custom workload, or contribute a new workload.
Full documentation — including how to run workloads, workload structure, operation types,
parameters, and how to write custom workloads — is available on the
-**[Apache Solr Benchmark documentation site](https://janhoy.github.io/solr-benchmark/)**.
+**[Apache Solr Orbit documentation site](https://apache.github.io/solr-orbit/)**.
## Quick start
```bash
# Run the default nyc_taxis workload against a local Solr cluster
-solr-benchmark run \
+solr-orbit run \
--pipeline=benchmark-only \
--target-host=localhost:8983 \
--workload=nyc_taxis \
--test-mode
# Or provision Solr via Docker and benchmark
-solr-benchmark run \
+solr-orbit run \
--pipeline=docker \
--distribution-version=9.10.1 \
--workload=nyc_taxis \
@@ -50,8 +50,8 @@ and contribute a new workload.
## Getting help
- Questions and discussion: [[email protected]](https://lists.apache.org/[email protected])
-- Bug reports and feature requests: [GitHub Issues](https://github.com/janhoy/solr-benchmark-workloads/issues)
-- Benchmark tool documentation: [Apache Solr Benchmark docs](https://janhoy.github.io/solr-benchmark/)
+- Bug reports and feature requests: [GitHub Issues](https://github.com/apache/solr-orbit-workloads/issues)
+- Benchmark tool documentation: [Apache Solr Orbit docs](https://apache.github.io/solr-orbit/)
## Data hosting
diff --git a/geonames/configsets/geonames/schema.xml b/geonames/configsets/geonames/schema.xml
index 9b24972..0b5f07d 100644
--- a/geonames/configsets/geonames/schema.xml
+++ b/geonames/configsets/geonames/schema.xml
@@ -15,7 +15,7 @@
See the License for the specific language governing permissions and
limitations under the License.
- Geonames workload schema for Apache Solr Benchmark.
+ Geonames workload schema for Apache Solr Orbit.
Field layout ported from the upstream Rally Tracks / OpenSearch Benchmark workload.
-->
<schema name="geonames" version="1.6">
diff --git a/nyc_taxis/README.md b/nyc_taxis/README.md
index d0ec542..0394b04 100644
--- a/nyc_taxis/README.md
+++ b/nyc_taxis/README.md
@@ -59,20 +59,20 @@ is needed.
```bash
# Quick sanity-check against a local Solr cluster (reduced corpus, fewer iterations)
-solr-benchmark run \
+solr-orbit run \
--pipeline=benchmark-only \
--target-host=localhost:8983 \
--workload=nyc_taxis \
--test-mode
# Full default benchmark (append-no-conflicts test procedure)
-solr-benchmark run \
+solr-orbit run \
--pipeline=benchmark-only \
--target-host=localhost:8983 \
--workload=nyc_taxis
# Provision Solr 9 via Docker and benchmark
-solr-benchmark run \
+solr-orbit run \
--pipeline=docker \
--distribution-version=9.10.1 \
--workload=nyc_taxis
@@ -116,11 +116,11 @@ Pass parameters with `--workload-params`, either inline or via a JSON file:
```bash
# Inline
-solr-benchmark run --workload=nyc_taxis \
+solr-orbit run --workload=nyc_taxis \
--workload-params="bulk_indexing_clients:4,num_shards:2"
# JSON file
-solr-benchmark run --workload=nyc_taxis \
+solr-orbit run --workload=nyc_taxis \
--workload-params=/path/to/params.json
```
diff --git a/nyc_taxis/TEST_PROCEDURES.md b/nyc_taxis/TEST_PROCEDURES.md
index e46c5f0..6ee759f 100644
--- a/nyc_taxis/TEST_PROCEDURES.md
+++ b/nyc_taxis/TEST_PROCEDURES.md
@@ -4,7 +4,7 @@ This file documents the test procedures available in the `nyc_taxis` workload.
Run any procedure with:
```bash
-solr-benchmark run --workload=nyc_taxis --test-procedure=<name>
+solr-orbit run --workload=nyc_taxis --test-procedure=<name>
```
---
diff --git a/nyc_taxis/configsets/nyc_taxis/schema.xml b/nyc_taxis/configsets/nyc_taxis/schema.xml
index 922e802..b9052d8 100644
--- a/nyc_taxis/configsets/nyc_taxis/schema.xml
+++ b/nyc_taxis/configsets/nyc_taxis/schema.xml
@@ -15,7 +15,7 @@
See the License for the specific language governing permissions and
limitations under the License.
- NYC Taxis workload schema for Apache Solr Benchmark.
+ NYC Taxis workload schema for Apache Solr Orbit.
Field layout ported from the upstream Rally Tracks / OpenSearch Benchmark workload.
-->
<schema name="nyc_taxis" version="1.6">