This is an automated email from the ASF dual-hosted git repository.
novatechflow pushed a commit to branch main
in repository https://gitbox.apache.org/repos/asf/wayang-website.git
The following commit(s) were added to refs/heads/main by this push:
new b08b7ea5 AI guidelines doc (#120)
b08b7ea5 is described below
commit b08b7ea51b384ade91c241165590a621a28407be
Author: Zoi Kaoudi <[email protected]>
AuthorDate: Tue May 26 09:20:56 2026 +0200
AI guidelines doc (#120)
* blogpost about graduation
* Update release process documentation for Wayang
Adjusted all appropriate incubator references and links
* ai-tooling policy doc
* new Getting Started page
* docs: Restructure getting started guide with quickstart examples
- Rename docs/guide/getting-started.md → docs/guide/examples.md
- Create new docs/guide/getting-started.mdx with interactive quickstart
- Update docs/guide/installation.md sidebar position (1 → 2)
- Add wayang-architecture.svg diagram for visual explanation
* Revise AI tool usage attribution in commit messages
Updated the AI tooling policy to change the attribution format from
'Generated-by:' to 'Coauthor-by:' for AI-generated contributions.
* Update AI tooling policy for clarity and compliance
---
docs/community/ai-tooling-policy.md | 20 ++
docs/guide/{getting-started.md => examples.md} | 24 +-
docs/guide/getting-started.mdx | 332 +++++++++++++++++++++++++
docs/guide/installation.md | 6 +-
static/img/wayang-architecture.svg | 78 ++++++
5 files changed, 445 insertions(+), 15 deletions(-)
diff --git a/docs/community/ai-tooling-policy.md
b/docs/community/ai-tooling-policy.md
new file mode 100644
index 00000000..24938426
--- /dev/null
+++ b/docs/community/ai-tooling-policy.md
@@ -0,0 +1,20 @@
+# Guidelines for AI-assisted Contributions
+
+The Apache Wayang community welcomes the use of AI and generative tooling as
part of the contribution process, provided contributors follow the guidelines
below. These guidelines align with the [ASF Generative Tooling
Guidance](https://www.apache.org/legal/generative-tooling.html).
+
+1. **Verify licensing compliance.** AI-generated code may inadvertently
reproduce copyrighted material. Before submitting, ensure the output does not
include content that conflicts with the [Apache 2.0
License](https://www.apache.org/licenses/LICENSE-2.0) or the [ASF 3rd Party
Licensing Policy](https://www.apache.org/legal/resolved.html).
+
+2. **Own the code. Understand what you submit.** If you cannot explain why the
code works, do not submit it. You are accountable for bugs, security issues,
and license violations in your contribution.
+
+3. **Disclose AI tool usage in commits.** When any part of a contribution was
generated or significantly assisted by an AI tool, include a `Coauthor-by:`
token in the commit message. For example:
+ ```
+ Fix null pointer in JdbcExecutor
+
+ Coauthor-by: GitHub Copilot
+ ```
+
+4. **Keep PR discussions human.** When participating in PR discussions, e.g.,
code review comments, questions, clarifications, and responses, the content
must be written by a human, not generated by an AI tool. If an AI tool (such as
GitHub Copilot) posts a comment, it must be clearly attributed as such and not
presented as the contributor's own words. This ensures that code review remains
a genuine exchange between people, preserving the quality, accountability, and
community trust that [...]
+
+5. **Use different tools for test generation.** Avoid generating unit tests
with the same tool you used to generate the functionality under test. This
helps avoid cases where the generated code exists only to make the tests pass.
+
+*These guidelines will be updated as AI tooling and the legal landscape around
it continue to evolve. Questions or suggestions can be raised on the [dev
mailing list](https://wayang.apache.org/docs/community/mailinglist).*
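
As an illustration of guideline 3 in the policy above, a reviewer or a local hook could scan a commit message for the attribution token. The following is a minimal sketch, not part of the Wayang tooling: the function name `find_ai_attribution` is hypothetical, and the token name is a parameter because its spelling should follow whatever the policy settles on; `Coauthor-by` from the commit message above is only the default assumed here.

```python
import re


def find_ai_attribution(message: str, token: str = "Coauthor-by") -> list[str]:
    """Return the tool names listed on `token:` lines of a commit message.

    Hypothetical helper; the default token is an assumption taken from the
    commit message, not a confirmed project convention.
    """
    pattern = re.compile(
        rf"^[ \t]*{re.escape(token)}:[ \t]*(\S.*?)[ \t]*$",
        re.IGNORECASE | re.MULTILINE,
    )
    return pattern.findall(message)


message = """Fix null pointer in JdbcExecutor

Coauthor-by: GitHub Copilot
"""

print(find_ai_attribution(message))
print(find_ai_attribution("Fix typo in README\n"))
```

An empty result means the message carries no attribution line for the given token; whether that is acceptable still depends on how the contribution was written.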
diff --git a/docs/guide/getting-started.md b/docs/guide/examples.md
similarity index 93%
rename from docs/guide/getting-started.md
rename to docs/guide/examples.md
index b8fe3c09..952e2389 100644
--- a/docs/guide/getting-started.md
+++ b/docs/guide/examples.md
@@ -1,7 +1,7 @@
---
-title: Getting started
-sidebar_position: 2
-id: getting-started
+title: Installation and Examples
+sidebar_position: 10
+id: examples
---
<!--
@@ -23,10 +23,10 @@ id: getting-started
-->
## Requirements
-Apache Wayang is built upon the foundations of Java 11 and Scala 2.12,
providing a robust and versatile platform for data processing applications. If
you intend to build Wayang from source, you will also need to have Apache
Maven, the popular build automation tool, installed on your system.
Additionally, be mindful that some of the processing platforms supported by
Wayang may have their own specific installation requirements.
+Apache Wayang is built upon the foundations of Java 11 and Scala 2.12,
providing a robust and versatile platform for data processing applications. If
you intend to build Wayang from source, you wi[...]
### Get Wayang
-Apache Wayang is readily available through Maven Central, facilitating
seamless integration into your development workflow. For instance, to utilize
Wayang in your Maven-based project, simply add the following dependency to your
project's POM file:
+Apache Wayang is readily available through Maven Central, facilitating
seamless integration into your development workflow. For instance, to utilize
Wayang in your Maven-based project, simply add [...]
```xml
<dependency>
<groupId>org.apache.wayang</groupId>
@@ -69,7 +69,7 @@ If you need to rebuild Wayang, e.g., to use a different Scala
version, you can s
```
### Configure Wayang
-To enable Apache Wayang's smooth operation, you need to equip it with details
about your processing platforms' capabilities and how to interact with them. A
default configuration is available for initial testing, but creating a
properties file is generally preferable for fine-tuning the configuration to
suit your specific requirements. To harness this personalized configuration
effortlessly, launch your application via
+To enable Apache Wayang's smooth operation, you need to equip it with details
about your processing platforms' capabilities and how to interact with them. A
default configuration is available for [...]
```shell
$ java -Dwayang.configuration=url://to/my/wayang.properties ...
```
@@ -79,7 +79,7 @@ Essential configuration settings:
* `wayang.core.log.enabled (= true)`: whether to log execution statistics
to allow learning better cardinality and cost estimators for the optimizer
* `wayang.core.log.executions (= ~/.wayang/executions.json)` where to log
execution times of operator groups
* `wayang.core.log.cardinalities (= ~/.wayang/cardinalities.json)` where
to log cardinality measurements
- * `wayang.core.optimizer.instrumentation (=
org.apache.wayang.core.profiling.OutboundInstrumentationStrategy)`: where to
measure cardinalities in Wayang plans; other options are
`org.apache.wayang.core.profiling.NoInstrumentationStrategy` and
`org.apache.wayang.core.profiling.FullInstrumentationStrategy`
+ * `wayang.core.optimizer.instrumentation (=
org.apache.wayang.core.profiling.OutboundInstrumentationStrategy)`: where to
measure cardinalities in Wayang plans; other options are `org.apache.wa[...]
* `wayang.core.optimizer.reoptimize (= false)`: whether to progressively
optimize Wayang plans
* `wayang.basic.tempdir (= file:///tmp)`: where to store temporary files,
in particular for inter-platform communication
* Java Streams
@@ -109,10 +109,10 @@ Essential configuration settings:
* `wayang.postgres.cpu.mhz (= 2700)`: clock frequency of processor
PostgreSQL runs on in MHz
* `wayang.postgres.cpu.cores (= 2)`: number of cores PostgreSQL runs on
-To effectively define your applications with Apache Wayang, utilize its Scala
or Java API, conveniently found within the `wayang-api` module. For clear
illustrations, refer to the provided examples below.
+To effectively define your applications with Apache Wayang, utilize its Scala
or Java API, conveniently found within the `wayang-api` module. For clear
illustrations, refer to the provided exampl[...]
## Cost Functions
-Wayang provides a utility to learn cost functions from historical execution
data. Specifically, Wayang can learn configurations for load profile estimators
(that estimate CPU load, disk load etc.) for both operators and UDFs, as long
as the configuration provides a template for those estimators.
+Wayang provides a utility to learn cost functions from historical execution
data. Specifically, Wayang can learn configurations for load profile estimators
(that estimate CPU load, disk load etc.[...]
As an example, the `JavaMapOperator` draws its load profile estimator
configuration via the configuration key `wayang.java.map.load`.
Now, it is possible to specify a load profile estimator template in the
configuration under the key `<original key>.template`, e.g.:
@@ -122,7 +122,7 @@ wayang.java.map.load.template = {\
"cpu":"?*in0"\
}
```
-This template encapsulates a load profile estimator that requires at minimum
one input cardinality and one output cardinality. Furthermore, it simulates CPU
load by assuming a direct relationship with the input cardinality. However,
more complex functions are possible.
+This template encapsulates a load profile estimator that requires at minimum
one input cardinality and one output cardinality. Furthermore, it simulates CPU
load by assuming a direct relationship[...]
In particular, you can use
* the variables `in0`, `in1`, ... and `out0`, `out1`, ... to incorporate the
input and output cardinalities, respectively;
@@ -131,12 +131,12 @@ In particular, you can use
* the functions `min(x0, x1, ...)`, `max(x0, x1, ...)`, `abs(x)`, `log(x,
base)`, `ln(x)`, `ld(x)`;
* and the constants `e` and `pi`.
-While Apache Wayang provides templates for all execution operators, you will
need to explicitly define your user-defined functions (UDFs) by specifying
their cost functions, which are based on configuration parameters. This
involves creating an initial specification and template for each UDF.
+While Apache Wayang provides templates for all execution operators, you will
need to explicitly define your user-defined functions (UDFs) by specifying
their cost functions, which are based on co[...]
As soon as execution data has been collected, you can initiate:
```shell
java ... org.apache.wayang.profiler.ga.GeneticOptimizerApp [configuration URL [execution log]]
```
-This tool will attempt to determine suitable values for the question marks
(`?`) within the load profile estimator templates, aligning them with the
collected execution data and pre-defined configuration entries for the load
profile estimators. These optimized values can then be directly incorporated
into your configuration.
+This tool will attempt to determine suitable values for the question marks
(`?`) within the load profile estimator templates, aligning them with the
collected execution data and pre-defined confi[...]
## Examples
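
To make the role of the `?` placeholders in the load profile estimator templates above more concrete, here is a minimal sketch of filling a template with candidate values and evaluating it for given cardinalities. It is not Wayang's expression parser: `evaluate_template` is a hypothetical helper, the mapping of `log`, `ln`, and `ld` onto Python's `math` functions is an assumption, and the parameter values are made up.

```python
import math


def evaluate_template(template: str, params: list[float], **cardinalities: float) -> float:
    """Replace each `?` with the next value from `params`, then evaluate.

    Illustrative only: uses eval() on a restricted namespace, which is not
    safe for untrusted input.
    """
    parts = template.split("?")
    if len(parts) - 1 != len(params):
        raise ValueError("need exactly one value per '?' placeholder")
    expression = parts[0]
    for value, rest in zip(params, parts[1:]):
        expression += repr(float(value)) + rest
    names = {
        "min": min, "max": max, "abs": abs,
        "log": math.log,   # log(x, base)
        "ln": math.log,    # natural logarithm
        "ld": math.log2,   # base-2 logarithm
        "e": math.e, "pi": math.pi,
        **cardinalities,   # in0, in1, ..., out0, out1, ...
    }
    return eval(expression, {"__builtins__": {}}, names)


# The template from the text: CPU load proportional to the input cardinality.
print(evaluate_template("?*in0", [2.5], in0=1000, out0=1000))
# A template with two placeholders, built from the documented function `max`.
print(evaluate_template("max(?*in0, ?*out0)", [2.0, 3.0], in0=10, out0=5))
```

A tool that fits the templates would search for the `params` values that best reproduce the logged execution times; the sketch only shows the evaluation step for one candidate.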
diff --git a/docs/guide/getting-started.mdx b/docs/guide/getting-started.mdx
new file mode 100644
index 00000000..24cf4792
--- /dev/null
+++ b/docs/guide/getting-started.mdx
@@ -0,0 +1,332 @@
+---
+title: Getting started
+sidebar_position: 1
+description: Write a data pipeline once and let Apache Wayang run it on the
best available engine — your laptop, Spark, Flink, or a database.
+---
+
+import Tabs from '@theme/Tabs';
+import TabItem from '@theme/TabItem';
+
+# Getting Started
+
+**Write your data pipeline once. Run it anywhere.**
+
+You write your pipeline against a single API, then decide how it runs. Point
it at one
+engine and it runs there — or hand Wayang's cost-based optimizer the choice
and let it pick
+the best platform for each step across your laptop, Apache Spark, Apache
Flink, or a
+database, even splitting a single job across several. Either way, when your
data outgrows
+one machine you don't rewrite anything — you just make another engine
available.
+
+This page gets you from zero to a running cross-platform pipeline in a few
minutes.
+
+---
+
+## How it works
+
+Most data tools lock you into one engine. Pick Spark, and your code is Spark
code forever.
+Outgrow it, or need a database in the mix, and you rewrite.
+
+Wayang sits one level up. You write a pipeline against Wayang's API and
register the engines
+you *have* — then it's your call. Want control? Register one engine and it
runs there. Want
+it handled? Register several and let Wayang's cost-based optimizer pick the
best one for each
+step:
+
+
+
+Same code on your laptop and on a 100-node cluster.
+
+---
+
+## Quickstart
+
+We'll run a word count locally first — no cluster, nothing to install on a
server — then
+make Spark available with a one-line change. The pipeline itself never
changes; only the
+set of engines you register does.
+
+### 1. Run locally
+
+<Tabs groupId="wayang-language">
+<TabItem value="java" label="Java" default>
+
+```java
+import org.apache.wayang.core.api.Configuration;
+import org.apache.wayang.core.api.WayangContext;
+import org.apache.wayang.api.JavaPlanBuilder;
+import org.apache.wayang.basic.data.Tuple2;
+import org.apache.wayang.java.Java;
+import java.util.Arrays;
+
+public class WordCount {
+ public static void main(String[] args) {
+ // Register ONLY the local Java engine → runs on your machine, no cluster needed.
+ WayangContext wayang = new WayangContext(new Configuration())
+ .withPlugin(Java.basicPlugin());
+
+ new JavaPlanBuilder(wayang)
+ .withJobName("WordCount")
+ .withUdfJarOf(WordCount.class)
+ .readTextFile("file:///path/to/input.txt")
+ .flatMap(line -> Arrays.asList(line.split("\\W+")))
+ .filter(word -> !word.isEmpty())
+ .map(word -> new Tuple2<>(word.toLowerCase(), 1))
+ .reduceByKey(Tuple2::getField0,
+ (t1, t2) -> new Tuple2<>(t1.getField0(),
t1.getField1() + t2.getField1()))
+ .writeTextFile("file:///path/to/output.txt", t ->
t.getField0() + ": " + t.getField1());
+ }
+}
+```
+
+</TabItem>
+<TabItem value="scala" label="Scala">
+
+```scala
+import org.apache.wayang.api._
+import org.apache.wayang.core.api.{Configuration, WayangContext}
+import org.apache.wayang.java.Java
+
+object WordCount {
+ def main(args: Array[String]): Unit = {
+ // Register ONLY the local Java engine → runs on your machine, no cluster needed.
+ val wayangCtx = new WayangContext(new Configuration)
+ wayangCtx.register(Java.basicPlugin)
+
+ new PlanBuilder(wayangCtx)
+ .withJobName("WordCount")
+ .withUdfJarsOf(this.getClass)
+ .readTextFile("file:///path/to/input.txt")
+ .flatMap(_.split("\\W+"))
+ .filter(_.nonEmpty)
+ .map(word => (word.toLowerCase, 1))
+ .reduceByKey(_._1, (c1, c2) => (c1._1, c1._2 + c2._2))
+ .writeTextFile("file:///path/to/output.txt", t => s"${t._1}: ${t._2}")
+ }
+}
+```
+
+</TabItem>
+<TabItem value="python" label="Python">
+
+```python
+from pywy.dataquanta import WayangContext
+from pywy.platforms.java import JavaPlugin
+
+# Register ONLY the local Java engine → runs on your machine, no cluster needed.
+ctx = WayangContext().register({JavaPlugin})
+
+(ctx
+ .textfile("file:///path/to/input.txt")
+ .flatmap(lambda line: line.split())
+ .filter(lambda word: word.strip() != "")
+ .map(lambda word: (word.lower(), 1))
+ .reduce_by_key(lambda t: t[0], lambda t1, t2: (t1[0], int(t1[1]) +
int(t2[1])))
+ .store_textfile("file:///path/to/output.txt"))
+```
+
+</TabItem>
+</Tabs>
+
+It executes locally. Good for development, tests, and small data.
+
+### 2. Run it on Spark
+
+Now run the *exact same pipeline* on Spark instead of locally. You don't touch
the pipeline —
+you change which platform you register: comment out Java and register Spark.
+
+<Tabs groupId="wayang-language">
+<TabItem value="java" label="Java" default>
+
+```java
+import org.apache.wayang.spark.Spark; // ← swap the import
+
+// Same pipeline as before — only the registered platform changed.
+WayangContext wayang = new WayangContext(new Configuration())
+ // .withPlugin(Java.basicPlugin()) // ← comment out the local engine
+ .withPlugin(Spark.basicPlugin()); // ← register Spark instead
+```
+
+</TabItem>
+<TabItem value="scala" label="Scala">
+
+```scala
+import org.apache.wayang.spark.Spark // ← swap the import
+
+// Same pipeline as before — only the registered platform changed.
+val wayangCtx = new WayangContext(new Configuration)
+// wayangCtx.register(Java.basicPlugin) // ← comment out the local engine
+wayangCtx.register(Spark.basicPlugin) // ← register Spark instead
+```
+
+</TabItem>
+<TabItem value="python" label="Python">
+
+```python
+from pywy.platforms.spark import SparkPlugin  # ← use Spark instead of Java
+
+# Same pipeline as before — only the registered platform changed.
+ctx = WayangContext().register({SparkPlugin})
+```
+
+</TabItem>
+</Tabs>
+
+Run it again. The same pipeline now executes on Spark — you changed *where* it
runs without
+changing *what* it does. Switch to Flink or any other supported platform the
same way: swap
+the import and the registered plugin.
+
+
+### 3. Register both and let the optimizer choose
+
+This is the point of Wayang. In practice you don't have to pick a platform at
all: you register every
+engine you have and let the optimizer choose the best one for each step — even
+splitting a single job across engines. The pipeline is still the same; you
just stop deciding
+where it runs.
+
+<Tabs groupId="wayang-language">
+<TabItem value="java" label="Java" default>
+
+```java
+// Register BOTH platforms — Wayang's optimizer decides which to use per step.
+WayangContext wayang = new WayangContext(new Configuration())
+ .withPlugin(Java.basicPlugin())
+ .withPlugin(Spark.basicPlugin());
+```
+
+</TabItem>
+<TabItem value="scala" label="Scala">
+
+```scala
+// Register BOTH platforms — Wayang's optimizer decides which to use per step.
+val wayangCtx = new WayangContext(new Configuration)
+wayangCtx.register(Java.basicPlugin)
+wayangCtx.register(Spark.basicPlugin)
+```
+
+</TabItem>
+<TabItem value="python" label="Python">
+
+```python
+# Register BOTH platforms — Wayang's optimizer decides which to use per step.
+ctx = WayangContext().register({JavaPlugin, SparkPlugin})
+```
+
+</TabItem>
+</Tabs>
+
+Now Wayang owns the placement decision. For each operator it estimates the
cost on every
+registered platform and picks the cheapest — keeping a small job entirely
local, pushing a
+large one onto Spark, or mixing both within the same job as the data demands.
You wrote the
+pipeline once in step 1; steps 2 and 3 only changed which engines were on the
table.
+
+> On a tiny input you'll see it keep everything local (that's the optimizer
working
+> correctly, not ignoring Spark). The cross-platform decisions show up once
the data is big
+> enough for them to pay off. Read [How Wayang chooses a
platform](https://wayang.apache.org/docs/introduction/about) for what
+> drives those choices.
+
+---
+
+## Install
+
+Replace `WAYANG_VERSION` below with the [latest Maven
release](https://mvnrepository.com/artifact/org.apache.wayang).
+
+<Tabs groupId="wayang-language">
+<TabItem value="java" label="Java/Scala" default>
+
+Maven:
+
+```xml
+<dependency>
+ <groupId>org.apache.wayang</groupId>
+ <artifactId>wayang-core</artifactId>
+ <version>WAYANG_VERSION</version>
+</dependency>
+<dependency>
+ <groupId>org.apache.wayang</groupId>
+ <artifactId>wayang-basic</artifactId>
+ <version>WAYANG_VERSION</version>
+</dependency>
+<dependency>
+ <groupId>org.apache.wayang</groupId>
+ <artifactId>wayang-api-scala-java</artifactId>
+ <version>WAYANG_VERSION</version>
+</dependency>
+<!-- add one artifact per engine you want available -->
+<dependency>
+ <groupId>org.apache.wayang</groupId>
+ <artifactId>wayang-java</artifactId>
+ <version>WAYANG_VERSION</version>
+</dependency>
+<dependency>
+ <groupId>org.apache.wayang</groupId>
+ <artifactId>wayang-spark</artifactId>
+ <version>WAYANG_VERSION</version>
+</dependency>
+```
+
+Or build from source, which does not depend on a published artifact. Make
sure to use the [latest snapshot
version](https://github.com/apache/wayang/blob/main/pom.xml):
+
+```bash
+git clone https://github.com/apache/wayang.git
+cd wayang
+./mvnw clean install -DskipTests
+```
+
+To run on Spark, you'll need Apache Spark 3+ on your `PATH`.
+
+</TabItem>
+<TabItem value="python" label="Python">
+
+:::note
+Getting Python running today is involved — you build both the Python
+package and the Wayang backend from source and start a local server before any
pipeline
+runs. A simpler one-command path (a prebuilt Docker image) is something the
project wants to
+offer, and help is welcome.
+:::
+
+pywayang has no PyPI release yet, and it's a client that talks to a running
Wayang REST
+API — so setup is two parts: install the Python package, then start the Wayang
backend.
+
+**1. Build and install the Python package** (from the repo root):
+
+```bash
+cd python
+pip install --upgrade build # may also require the python3-venv system package
+python3 -m build
+python3 -m pip install dist/pywy-WAYANG_VERSION.tar.gz
+```
+
+**2. Build the Wayang assembly and start the REST API** (from the repo root):
+
+```bash
+./mvnw clean package -pl :wayang-assembly -Pdistribution
+cd wayang-assembly/target/
+tar -xf apache-wayang-assembly-WAYANG_VERSION-dist.tar.gz
+cd wayang-WAYANG_VERSION
+./bin/wayang-submit org.apache.wayang.api.json.Main &
+```
+
+Before packaging, set the Python worker paths in
+`wayang-api/wayang-api-python/src/main/resources/wayang-api-python-defaults.properties`
+— the location of `python/src/pywy/execution/worker.py`, your `python3`
command, and your
+site-packages directory. You'll also need a Java 11+ runtime available.
+
+With the REST API running, the Python examples above will execute. See the
+[pywayang README](https://github.com/apache/wayang/tree/main/python) for full
details.
+
+</TabItem>
+</Tabs>
+
+---
+
+<!--
+
+## What to read next
+
+- **[How Wayang chooses a platform](./examples/)** — how the optimizer decides
where each step runs.
+- **[Supported platforms](./api-documentation/)** — Spark, Flink, Java
streams, Postgres, and more.
+- **[Add your own platform](./examples/)** — Wayang's plugin architecture lets
you teach it new engines.
+- **[Join the
community](https://wayang.apache.org/docs/community/mailinglist)** — mailing
lists, where questions get answered and contributors start.
+---
+-->
+
+*Stuck on the first run? That's a documentation bug, and we want to know.
+Ask on the [user mailing
list](https://wayang.apache.org/docs/community/mailinglist).*
\ No newline at end of file
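
Step 3 of the quickstart above says that, for each operator, Wayang estimates the cost on every registered platform and picks the cheapest. The toy sketch below shows only that selection rule. It is not Wayang's optimizer: `choose_platforms` is a hypothetical function, the cost numbers are invented, and the cost of moving data between engines, which a real optimizer would have to weigh, is ignored.

```python
def choose_platforms(costs: dict[str, dict[str, float]]) -> dict[str, str]:
    """For every operator, pick the registered platform with the lowest estimated cost."""
    return {
        operator: min(by_platform, key=by_platform.get)
        for operator, by_platform in costs.items()
    }


# Invented estimates for a tiny input: the fixed overhead of Spark dominates.
small_job = {
    "flatMap": {"java": 1.0, "spark": 50.0},
    "reduceByKey": {"java": 2.0, "spark": 55.0},
}

# Invented estimates for a large input: the heavy steps are cheaper on Spark,
# while the final write of a small result stays local.
large_job = {
    "flatMap": {"java": 900.0, "spark": 120.0},
    "reduceByKey": {"java": 1500.0, "spark": 200.0},
    "writeTextFile": {"java": 3.0, "spark": 40.0},
}

print(choose_platforms(small_job))
print(choose_platforms(large_job))
```

With these made-up numbers the small job stays entirely on the local engine and the large job is split across both, which matches the behavior the quickstart describes for tiny versus large inputs.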
diff --git a/docs/guide/installation.md b/docs/guide/installation.md
index 3bb53b45..10ee2d5e 100644
--- a/docs/guide/installation.md
+++ b/docs/guide/installation.md
@@ -1,6 +1,6 @@
---
title: How to build Wayang
-sidebar_position: 1
+sidebar_position: 2
id: installation
---
@@ -59,8 +59,8 @@ echo "export PATH=${PATH}:${WAYANG_HOME}/bin" >> ~/.zshrc
source ~/.zshrc
```
### Others
-- You need to install Apache Spark version 3 or higher. Don’t forget to set
the `SPARK_HOME` environment variable.
-- You need to install Apache Hadoop version 3 or higher. Don’t forget to set
the `HADOOP_HOME` environment variable.
+- You need to install Apache Spark version 3 or higher. Don't forget to set
the `SPARK_HOME` environment variable.
+- You need to install Apache Hadoop version 3 or higher. Don't forget to set
the `HADOOP_HOME` environment variable.
## Run the program
diff --git a/static/img/wayang-architecture.svg
b/static/img/wayang-architecture.svg
new file mode 100644
index 00000000..4ad98507
--- /dev/null
+++ b/static/img/wayang-architecture.svg
@@ -0,0 +1,78 @@
+<svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 760 360" role="img"
+ aria-label="A single pipeline, written once, feeds the Wayang optimizer,
which routes work to the best available engine: Local, Spark, Flink, Postgres,
and others.">
+ <style>
+ svg { font-family: ui-sans-serif, system-ui, -apple-system, "Segoe UI",
Roboto, Helvetica, Arial, sans-serif; }
+ :root {
+ --box-fill: #ffffff;
+ --box-stroke: #cbd5e1;
+ --engine-fill: #f8fafc;
+ --accent: #2563eb;
+ --accent-fill: #eff6ff;
+ --text: #1e293b;
+ --muted: #64748b;
+ --line: #94a3b8;
+ }
+ @media (prefers-color-scheme: dark) {
+ :root {
+ --box-fill: #1e293b;
+ --box-stroke: #334155;
+ --engine-fill: #0f172a;
+ --accent: #60a5fa;
+ --accent-fill: #15294a;
+ --text: #e2e8f0;
+ --muted: #94a3b8;
+ --line: #475569;
+ }
+ }
+ .box { fill: var(--box-fill); stroke: var(--box-stroke);
stroke-width: 1.5; }
+ .opt { fill: var(--accent-fill); stroke: var(--accent);
stroke-width: 2; }
+ .engine { fill: var(--engine-fill); stroke: var(--box-stroke);
stroke-width: 1.5; }
+ .line { stroke: var(--line); stroke-width: 2; fill: none; }
+ .title { fill: var(--text); font-size: 16px; font-weight: 600; }
+ .optttl { fill: var(--accent);font-size: 16px; font-weight: 700; }
+ .sub { fill: var(--muted); font-size: 12px; }
+ .eng { fill: var(--text); font-size: 15px; font-weight: 600; }
+ .cap { fill: var(--muted); font-size: 12px; font-style: italic; }
+ </style>
+
+ <defs>
+ <marker id="arrow" viewBox="0 0 10 10" refX="8" refY="5" markerWidth="7"
markerHeight="7" orient="auto-start-reverse">
+ <path d="M 0 0 L 10 5 L 0 10 z" fill="var(--line)"/>
+ </marker>
+ </defs>
+
+ <!-- Pipeline (written once) -->
+ <rect class="box" x="280" y="20" width="200" height="54" rx="10"/>
+ <text class="title" x="380" y="45" text-anchor="middle">Your pipeline</text>
+ <text class="sub" x="380" y="63" text-anchor="middle">written once</text>
+
+ <!-- arrow: pipeline -> optimizer -->
+ <line class="line" x1="380" y1="74" x2="380" y2="112"
marker-end="url(#arrow)"/>
+
+ <!-- Optimizer -->
+ <rect class="opt" x="250" y="118" width="260" height="58" rx="10"/>
+ <text class="optttl" x="380" y="144" text-anchor="middle">Wayang
optimizer</text>
+ <text class="sub" x="380" y="163" text-anchor="middle">chooses where each
step runs</text>
+
+ <!-- tree connector from optimizer down to the engine row -->
+ <path class="line" d="M 380 176 L 380 210 M 80 210 L 680 210"/>
+ <line class="line" x1="80" y1="210" x2="80" y2="266"
marker-end="url(#arrow)"/>
+ <line class="line" x1="280" y1="210" x2="280" y2="266"
marker-end="url(#arrow)"/>
+ <line class="line" x1="480" y1="210" x2="480" y2="266"
marker-end="url(#arrow)"/>
+ <line class="line" x1="680" y1="210" x2="680" y2="266"
marker-end="url(#arrow)"/>
+
+ <!-- Engines -->
+ <rect class="engine" x="20" y="270" width="120" height="50" rx="8"/>
+ <text class="eng" x="80" y="300" text-anchor="middle">Local</text>
+
+ <rect class="engine" x="220" y="270" width="120" height="50" rx="8"/>
+ <text class="eng" x="280" y="300" text-anchor="middle">Spark</text>
+
+ <rect class="engine" x="420" y="270" width="120" height="50" rx="8"/>
+ <text class="eng" x="480" y="300" text-anchor="middle">Flink</text>
+
+ <rect class="engine" x="620" y="270" width="120" height="50" rx="8"/>
+ <text class="eng" x="680" y="300" text-anchor="middle">Postgres</text>
+
+ <text class="cap" x="380" y="346" text-anchor="middle">…and any other
supported platform</text>
+</svg>