This is an automated email from the ASF dual-hosted git repository.
wu-sheng pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/skywalking.git
The following commit(s) were added to refs/heads/master by this push:
new 53baf8e5da Fix runtime-rule (MAL/LAL) hot-update in no-init mode + k8s
cluster node identity (#13909)
53baf8e5da is described below
commit 53baf8e5da6c45670da6509eb7fcceed7b082ceb
Author: 吴晟 Wu Sheng <[email protected]>
AuthorDate: Sun Jun 14 08:04:36 2026 +0800
Fix runtime-rule (MAL/LAL) hot-update in no-init mode + k8s cluster node
identity (#13909)
Two bugs in the runtime-rule (DSL hot-update) cluster path, both confirmed
end-to-end on a local kind cluster:
**1. Runtime-rule schema changes were inoperative in `no-init` mode** — the
mode every production OAP cluster runs (a one-shot `-Dmode=init` Job creates
the static schema; the OAP Deployment runs `-Dmode=no-init`). A runtime
`addOrUpdate` introducing a new metric blocked forever in the storage
installer's init-node poll loop (`ModelInstaller.whenCreating`), because the
loop was gated on `RunningMode` rather than the operation's intent.
`/delete?mode=revertToBundled` recreate and Banya [...]
**2. Runtime-rule cross-node writes failed with `HTTP 400
forward_self_loop` on a multi-replica Kubernetes cluster.** Every OAP replica
shared the cluster `selfNodeId` `0.0.0.0_11800` (derived from the `0.0.0.0`
agent gRPC bind host via `TelemetryRelatedContext`), so the main's self-loop
guard rejected a legitimate peer-to-peer Forward as if it had looped back.
**Fix:** resolve the runtime-rule node identity from the unique per-pod
`SKYWALKING_COLLECTOR_UID` (the pod UID injected by t [...]
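
A minimal sketch of the identity collision in bug 2 and of the fallback order the fix describes. It is an illustration under stated assumptions, not the committed code: `SelfNodeIdSketch` and `isSelfLoop` are hypothetical names, the environment is passed in as a map so the example runs standalone, and the pod UID values are made up.

```java
import java.util.Map;

final class SelfNodeIdSketch {
    // Env var name taken from the commit message.
    static final String COLLECTOR_UID_ENV = "SKYWALKING_COLLECTOR_UID";

    // Prefer the per-pod UID; fall back to the telemetry id (host_port) when it is absent or blank.
    static String resolveSelfNodeId(final Map<String, String> env, final String telemetryId) {
        final String uid = env.get(COLLECTOR_UID_ENV);
        return (uid != null && !uid.trim().isEmpty()) ? uid : telemetryId;
    }

    // Hypothetical stand-in for the receiver's guard: sender id == own id means "looped back".
    static boolean isSelfLoop(final String senderId, final String selfId) {
        return senderId.equals(selfId);
    }

    public static void main(final String[] args) {
        final String telemetryId = "0.0.0.0_11800"; // identical on every replica under k8s

        // Without the env var, two distinct replicas resolve to the same id.
        final String peerOld = resolveSelfNodeId(Map.of(), telemetryId);
        final String mainOld = resolveSelfNodeId(Map.of(), telemetryId);
        System.out.println("without pod UID, guard rejects: " + isSelfLoop(peerOld, mainOld));

        // With a per-pod UID (made-up values), the ids differ and the Forward passes.
        final String peerNew = resolveSelfNodeId(Map.of(COLLECTOR_UID_ENV, "uid-a"), telemetryId);
        final String mainNew = resolveSelfNodeId(Map.of(COLLECTOR_UID_ENV, "uid-b"), telemetryId);
        System.out.println("with pod UID, guard rejects: " + isSelfLoop(peerNew, mainNew));
    }
}
```

The fallback keeps non-k8s deployments unchanged, where each node's telemetry id is presumably already distinct.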
**Tests:** new `ModelInstallerNoInitTest` (UT) for the no-init create
chokepoint; the runtime-rule cluster e2e is converted from docker-compose
(default mode — which never exercised either bug) to a kind + skywalking-helm
`no-init` cluster (`oap.replicas=2`) driving the apply / STRUCTURAL /
inactivate / delete lifecycle, cross-node convergence, and the cross-node
Forward path.
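
The gating change behind bug 1 can be sketched as follows. This is a simplified model, assuming only what the commit message states: the poll loop used to be entered for any operation on a `no-init` node, and is now entered only when the operation itself asks to defer DDL to the init node. The enums below are hypothetical stand-ins for `RunningMode` and `StorageManipulationOpt`, not the real classes.

```java
final class DeferGateSketch {
    enum Mode { INIT, NO_INIT, DEFAULT }

    // Stand-in for the opt: only the static boot-time opt carries the defer flag.
    enum Opt {
        SCHEMA_CREATE_IF_ABSENT(true),
        WITH_SCHEMA_CHANGE(false),
        VERIFY_SCHEMA_ONLY(false),
        WITHOUT_SCHEMA_CHANGE(false);

        final boolean deferDDLToInitNode;

        Opt(final boolean deferDDLToInitNode) {
            this.deferDDLToInitNode = deferDDLToInitNode;
        }
    }

    // Old gate: keyed off the running mode alone, so every opt waited on a no-init node.
    static boolean oldGateWaits(final Mode mode, final Opt opt) {
        return mode == Mode.NO_INIT;
    }

    // New gate: waits only when the node is no-init AND the opt defers to the init node.
    static boolean newGateWaits(final Mode mode, final Opt opt) {
        return mode == Mode.NO_INIT && opt.deferDDLToInitNode;
    }

    public static void main(final String[] args) {
        for (final Opt opt : Opt.values()) {
            System.out.printf("%-24s old waits=%-5b new waits=%b%n",
                opt, oldGateWaits(Mode.NO_INIT, opt), newGateWaits(Mode.NO_INIT, opt));
        }
    }
}
```

Putting the flag on the opt rather than re-checking the running mode in each installer keeps the "who owns DDL" decision in one place, which is the rationale the diff's own Javadoc gives for `deferDDLToInitNode`.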
---
.github/workflows/skywalking.yaml | 3 +-
docs/en/changes/changes.md | 2 +
.../module/RuntimeRuleModuleProvider.java | 67 ++++++--
.../receiver/runtimerule/reconcile/DSLManager.java | 83 +++++-----
.../server/core/storage/model/ModelInstaller.java | 30 +++-
.../core/storage/model/StorageManipulationOpt.java | 42 ++++-
.../storage/model/ModelInstallerNoInitTest.java | 140 ++++++++++++++++
.../plugin/banyandb/BanyanDBIndexInstaller.java | 39 +++--
.../cases/runtime-rule/cluster/cluster-flow.sh | 183 ++++++++++++++-------
.../cases/runtime-rule/cluster/docker-compose.yml | 93 -----------
test/e2e-v2/cases/runtime-rule/cluster/e2e.yaml | 84 +++++++---
.../cases/runtime-rule/cluster/expected/ok.txt | 1 -
.../cluster/expected/ready-replicas.txt | 1 +
test/e2e-v2/cases/runtime-rule/cluster/kind.yaml | 23 +++
14 files changed, 529 insertions(+), 262 deletions(-)
diff --git a/.github/workflows/skywalking.yaml b/.github/workflows/skywalking.yaml
index 504dc19a7f..05adfffa9e 100644
--- a/.github/workflows/skywalking.yaml
+++ b/.github/workflows/skywalking.yaml
@@ -399,8 +399,9 @@ jobs:
env: ES_VERSION=8.18.8
- name: Runtime Rule LAL Hot-Update
config: test/e2e-v2/cases/runtime-rule/lal/e2e.yaml
- - name: Runtime Rule Cluster Convergence
+ - name: Runtime Rule Cluster (kind)
config: test/e2e-v2/cases/runtime-rule/cluster/e2e.yaml
+ runs-on: ubuntu-24.04
- name: DSL Debug API — MAL
config: test/e2e-v2/cases/dsl-debugging/mal/e2e.yaml
- name: DSL Debug API — OAL
diff --git a/docs/en/changes/changes.md b/docs/en/changes/changes.md
index bffa0e8405..51aa0dfa79 100644
--- a/docs/en/changes/changes.md
+++ b/docs/en/changes/changes.md
@@ -248,6 +248,8 @@
refcount-tracked and unregistered when the last declaring rule is removed. See [runtime-rule-hot-update.md#dynamic-layers](../concepts-and-designs/runtime-rule-hot-update.md) for the conflict rules and limitations.
+* Fix: runtime-rule (MAL/LAL hot-update) schema changes now work in `no-init` mode — the deployment mode every production cluster runs. Previously a runtime `addOrUpdate` that introduced a new metric blocked forever in the storage installer's init-node poll loop (`ModelInstaller.whenCreating`) on a `no-init` OAP, because the gate keyed off `RunningMode` rather than the operation's intent; the `/delete?mode=revertToBundled` recreate and BanyanDB in-place shape updates were dead the same w [...]
+* Fix: runtime-rule cross-node writes no longer fail with `HTTP 400 forward_self_loop` on a multi-replica Kubernetes cluster. Every OAP replica shared the cluster `selfNodeId` `0.0.0.0_11800` (derived from the `0.0.0.0` agent gRPC bind host via `TelemetryRelatedContext`), so the main's self-loop guard rejected a legitimate peer-to-peer Forward as if it had looped back. The runtime-rule node identity now prefers the unique per-pod `SKYWALKING_COLLECTOR_UID` (the pod UID injected by the he [...]
* Fix: remove the redundant tags from the `envoy-ai-gateway.yaml` LAL configuration.
* Add Zipkin Virtual GenAI e2e test. Use `zipkin_json` exporter to avoid
protobuf dependency conflict
between `opentelemetry-exporter-zipkin-proto-http` (protobuf~=3.12) and
`opentelemetry-proto` (protobuf>=5.0).
diff --git a/oap-server/server-admin/runtime-rule/src/main/java/org/apache/skywalking/oap/server/receiver/runtimerule/module/RuntimeRuleModuleProvider.java b/oap-server/server-admin/runtime-rule/src/main/java/org/apache/skywalking/oap/server/receiver/runtimerule/module/RuntimeRuleModuleProvider.java
index f75d770512..ec7197d5de 100644
--- a/oap-server/server-admin/runtime-rule/src/main/java/org/apache/skywalking/oap/server/receiver/runtimerule/module/RuntimeRuleModuleProvider.java
+++ b/oap-server/server-admin/runtime-rule/src/main/java/org/apache/skywalking/oap/server/receiver/runtimerule/module/RuntimeRuleModuleProvider.java
@@ -219,6 +219,15 @@ public class RuntimeRuleModuleProvider extends ModuleProvider {
*/
private static final long SCHEDULER_INITIAL_DELAY_SECONDS = 2L;
+ /**
+ * Env var carrying this OAP's unique per-node identity — the Kubernetes pod UID, injected
+ * by the skywalking-helm chart / swck operator from {@code metadata.uid}. Used as the
+ * runtime-rule cluster {@code selfNodeId} when present, because the telemetry-id fallback
+ * (gRPC {@code host_port}) collides across replicas under k8s where the bind host is
+ * {@code 0.0.0.0} (every pod reports {@code 0.0.0.0_11800}).
+ */
+ private static final String COLLECTOR_UID_ENV = "SKYWALKING_COLLECTOR_UID";
+
private RuntimeRuleModuleConfig moduleConfig;
private ScheduledExecutorService reconcilerExecutor;
private DSLManager dslManager;
@@ -272,7 +281,12 @@ public class RuntimeRuleModuleProvider extends ModuleProvider {
// cluster gRPC bus (default 11800). Privileged admin RPCs stay on the
// admin-only port (default 17129) so a compromised node on the agent
// network cannot reach Suspend/Resume/Forward.
- final String selfNodeId = TelemetryRelatedContext.INSTANCE.getId();
+ // Resolve this node's stable, unique cluster identity HERE in start() — before
+ // notifyAfterCompleted() applies any rule — so the node knows who it is before it
+ // forwards a write to the main or broadcasts Suspend/Resume. Must be unique per
+ // replica: it is the Forward/Suspend/Resume sender id and the key the receiver's
+ // self-loop guard compares against. See resolveSelfNodeId().
+ final String selfNodeId = resolveSelfNodeId();
final AdminClusterChannelManager adminPeerChannels =
getManager().find(AdminServerModule.NAME).provider()
.getService(AdminClusterChannelManager.class);
@@ -343,24 +357,25 @@ public class RuntimeRuleModuleProvider extends
ModuleProvider {
// applies under {@code withSchemaChange} if this node resolves as
main. Backend DDL is
// idempotent so the re-apply costs nothing.
try {
- // atBoot=true so a no-init OAP picks verifySchemaOnly and refuses
to
- // start with a missing or shape-mismatched backend (k8s pod
backloop)
+ // atBoot=true so a cluster peer picks verifySchemaOnly and
refuses to
+ // start against a missing or shape-mismatched backend (k8s pod
backloop)
// instead of silently registering local workers against schema
that
- // doesn't exist. Init / default-mode OAPs are unaffected — their
boot
- // opt mirrors the standard tick choice for those modes.
+ // doesn't exist; the main picks withSchemaChange and re-creates
missing
+ // runtime schema. The choice is by cluster main-ness, not running
mode
+ // (see DSLManager.tickStorageOpt); init mode is the lone
exception.
dslManager.tick(true);
log.info("Runtime rule dslManager: synchronous first tick
completed "
+ "(runtime-only DB rows are now applied locally).");
} catch (final RuntimeException re) {
- // Boot pass under verifySchemaOnly re-throws missing/mismatch as a
- // RuntimeException so module bootstrap aborts. Translate to
- // ModuleStartException so the OAP exit message points the
operator at
- // the right place.
+ // The boot pass re-throws as a RuntimeException so module
bootstrap aborts —
+ // a peer's verifySchemaOnly hitting a missing/mismatched backend,
or a main's
+ // withSchemaChange failing to create it. Translate to
ModuleStartException so
+ // the OAP exit message points the operator at the right place.
throw new ModuleStartException(
- "Runtime rule dslManager boot pass failed under
verifySchemaOnly; "
- + "the backend schema is missing or diverges from the
declared rule. "
- + "Bring up the init OAP first or align rule files with
the backend, "
- + "then restart this node.",
+ "Runtime rule dslManager boot pass failed: backend schema is
missing, "
+ + "diverges from the declared rule, or could not be
created. On a peer, "
+ + "bring up the cluster main (or init OAP) first; on the
main, align the "
+ + "rule files with the backend, then restart this node.",
re);
} catch (final Throwable t) {
log.warn("Runtime rule dslManager: synchronous first tick failed —
"
@@ -393,6 +408,32 @@ public class RuntimeRuleModuleProvider extends ModuleProvider {
SCHEDULER_INITIAL_DELAY_SECONDS, intervalSeconds);
}
+ /**
+ * Resolve this node's unique, stable runtime-rule cluster identity. Prefers the Kubernetes
+ * pod UID ({@value #COLLECTOR_UID_ENV}, injected by the helm chart / swck operator from
+ * {@code metadata.uid}) because it is unique per replica; falls back to the telemetry id
+ * ({@code host_port}) for non-k8s deployments where each node already has a distinct host.
+ *
+ * <p>Why not the telemetry id directly: under Kubernetes the agent gRPC bind host is
+ * {@code 0.0.0.0}, so every replica's telemetry id is {@code 0.0.0.0_11800} — identical.
+ * That collision makes the receiver's self-loop guard (sender id == own id) reject a
+ * legitimate peer-to-peer Forward as if it had looped back, breaking cross-node writes on
+ * any multi-replica k8s cluster. {@code MainRouter} already routes correctly off the
+ * cluster peer addresses (pod IPs); only the self-identity used for loop suppression needs
+ * to be unique, which the pod UID guarantees.
+ */
+ private String resolveSelfNodeId() {
+ final String collectorUid = System.getenv(COLLECTOR_UID_ENV);
+ if (collectorUid != null && !collectorUid.trim().isEmpty()) {
+ log.info("Runtime rule: selfNodeId from {} (pod UID) = {}", COLLECTOR_UID_ENV, collectorUid);
+ return collectorUid;
+ }
+ final String telemetryId = TelemetryRelatedContext.INSTANCE.getId();
+ log.info("Runtime rule: {} not set; selfNodeId falls back to telemetry id = {} "
+ "(ensure it is unique per node in a multi-node cluster).", COLLECTOR_UID_ENV, telemetryId);
+ return telemetryId;
+ }
+
@Override
public String[] requiredModules() {
return new String[] {
diff --git a/oap-server/server-admin/runtime-rule/src/main/java/org/apache/skywalking/oap/server/receiver/runtimerule/reconcile/DSLManager.java b/oap-server/server-admin/runtime-rule/src/main/java/org/apache/skywalking/oap/server/receiver/runtimerule/reconcile/DSLManager.java
index bd46ae69ed..102a05ebbe 100644
--- a/oap-server/server-admin/runtime-rule/src/main/java/org/apache/skywalking/oap/server/receiver/runtimerule/reconcile/DSLManager.java
+++ b/oap-server/server-admin/runtime-rule/src/main/java/org/apache/skywalking/oap/server/receiver/runtimerule/reconcile/DSLManager.java
@@ -229,15 +229,17 @@ public final class DSLManager {
/**
* Variant invoked once at boot from {@code RuntimeRuleModuleProvider.notifyAfterCompleted}
- * with {@code atBoot=true}. The boot pass on a no-init OAP picks
+ * with {@code atBoot=true}. The boot pass on a cluster <em>peer</em> picks
* {@link StorageManipulationOpt#verifySchemaOnly()} so missing or shape-mismatched
* backend schema fails the bootstrap (k8s pod backloop) instead of silently
- * proceeding. The scheduled executor calls the no-arg overload so subsequent ticks
- * stay on the lenient {@code withoutSchemaChange} retry path.
+ * proceeding; the <em>main</em> picks {@link StorageManipulationOpt#withSchemaChange()}
+ * so it re-creates any missing runtime schema. The scheduled executor calls the no-arg
+ * overload so subsequent peer ticks stay on the lenient {@code withoutSchemaChange}
+ * retry path.
*
- * <p>Boot semantics are scoped to no-init mode only — init-mode OAPs continue to
- * pick {@link StorageManipulationOpt#schemaCreateIfAbsent()} (boot creates), and
- * default-mode OAPs continue to pick by cluster main-ness.
+ * <p>The choice is by cluster main-ness, not running mode — no-init and default behave
+ * identically (see {@link #tickStorageOpt}). Init mode is the one exception: the
+ * dedicated initialiser picks {@link StorageManipulationOpt#schemaCreateIfAbsent()}.
*/
public void tick(final boolean atBoot) {
try {
@@ -708,44 +710,38 @@ public final class DSLManager {
/**
* Pick the {@link StorageManipulationOpt} for a tick-driven apply.
*
- * <p>Two axes:
+ * <p>For runtime-rule (DSL) DDL the only axis that matters is <b>cluster
main-ness</b> —
+ * <em>not</em> the init / no-init / default running mode. The
running-mode axis governs
+ * <em>static</em> schema (the init OAP creates it, no-init OAPs wait); a
runtime rule is
+ * created at runtime and the init OAP never knows about it, so gating DSL
DDL on running
+ * mode would leave every production (no-init) cluster unable to apply
rules. no-init and
+ * default therefore behave identically here.
*
- * <p><b>RunningMode (boot/init context).</b>
+ * <p><b>init mode</b> — the one exception. The dedicated initialiser picks
+ * {@link StorageManipulationOpt#schemaCreateIfAbsent()}, matching the
static-rule install
+ * path (create-if-absent, idempotent against a backend that already holds
the resource).
+ *
+ * <p><b>Everything else (no-init or default)</b> — branch on main-ness:
* <ul>
- * <li>{@code init} mode — OAP is the dedicated initialiser; install
schema if
- * absent. {@link StorageManipulationOpt#schemaCreateIfAbsent()}
matches what the
- * rest of the static-rule install path does in init mode
(idempotent against
- * backends that already hold the table).
- * <li>{@code no-init} mode — this OAP must NOT touch the backend; the
init OAP
- * owns schema. The opt depends on whether this is the synchronous
boot pass
- * or a scheduled tick:
+ * <li>Self is main → {@link StorageManipulationOpt#withSchemaChange()}.
The authority
+ * creates / updates / drops backend schema. The boot pass uses this
too, so a main
+ * re-creates any missing runtime schema at startup.
+ * <li>Peer (someone else is main):
* <ul>
* <li><b>Boot pass</b> ({@code atBoot=true}) →
- * {@link StorageManipulationOpt#verifySchemaOnly()}. Strict:
backend
- * resources must already exist with the declared shape. A
missing or
- * mismatched schema fails the bootstrap (k8s pod backloop) —
operator must
- * bring up the init OAP first, or align rule files with the
backend.
+ * {@link StorageManipulationOpt#verifySchemaOnly()}. Strict:
refuse to start
+ * against a backend the main hasn't prepared (k8s pod backloop
until the main
+ * converges).
* <li><b>Scheduled tick</b> ({@code atBoot=false}) →
* {@link StorageManipulationOpt#withoutSchemaChange()}.
Lenient: the timer
- * retries forever without raising errors so transient absence
(init OAP
- * still catching up between ticks) self-heals.
+ * retries without raising so transient absence (main still
catching up between
+ * ticks) self-heals.
* </ul>
- * <li>default mode (regular running OAP) — branch on cluster main-ness,
see below.
- * </ul>
- *
- * <p><b>Cluster main-ness (default mode only).</b>
- * <ul>
- * <li>Self is main → {@link StorageManipulationOpt#withSchemaChange()}.
The REST path
- * has the same shape; tick rarely runs on main because REST usually
- * converges the main's state first.
- * <li>Peer (someone else is main) → {@link
StorageManipulationOpt#withoutSchemaChange()}.
- * Local MeterSystem + MetadataRegistry populate so the peer
dispatches samples
- * correctly, but no server-side DDL fires.
* </ul>
*
- * <p>When the cluster module isn't wired (embedded test topology), {@link
- * MainRouter#isSelfMain} returns {@code true} and the default-mode branch
falls
- * through to {@code withSchemaChange} — single-process deployments are
always main.
+ * <p>When the cluster module isn't wired (embedded / single-process
topology),
+ * {@link MainRouter#isSelfMain} returns {@code true} so we fall through to
+ * {@code withSchemaChange} — a single process is always its own main.
*
* @param atBoot true for the synchronous one-shot pass invoked from
* {@code RuntimeRuleModuleProvider.notifyAfterCompleted};
false for
@@ -755,20 +751,21 @@ public final class DSLManager {
if (RunningMode.isInitMode()) {
return StorageManipulationOpt.schemaCreateIfAbsent();
}
- if (RunningMode.isNoInitMode()) {
- return atBoot
- ? StorageManipulationOpt.verifySchemaOnly()
- : StorageManipulationOpt.withoutSchemaChange();
- }
+ final boolean selfMain;
try {
final AdminClusterChannelManager apm =
moduleManager.find(AdminServerModule.NAME).provider()
.getService(AdminClusterChannelManager.class);
- return MainRouter.isSelfMain(apm)
- ? StorageManipulationOpt.withSchemaChange()
- : StorageManipulationOpt.withoutSchemaChange();
+ selfMain = MainRouter.isSelfMain(apm);
} catch (final Throwable t) {
+ // Cluster module not wired (embedded / single-process) — always main.
+ return StorageManipulationOpt.withSchemaChange();
+ }
+ if (selfMain) {
return StorageManipulationOpt.withSchemaChange();
}
+ return atBoot
+ ? StorageManipulationOpt.verifySchemaOnly()
+ : StorageManipulationOpt.withoutSchemaChange();
}
}
diff --git a/oap-server/server-core/src/main/java/org/apache/skywalking/oap/server/core/storage/model/ModelInstaller.java b/oap-server/server-core/src/main/java/org/apache/skywalking/oap/server/core/storage/model/ModelInstaller.java
index 735e7b379d..d65d7fb75b 100644
--- a/oap-server/server-core/src/main/java/org/apache/skywalking/oap/server/core/storage/model/ModelInstaller.java
+++ b/oap-server/server-core/src/main/java/org/apache/skywalking/oap/server/core/storage/model/ModelInstaller.java
@@ -89,10 +89,15 @@ public abstract class ModelInstaller implements ModelRegistry.CreatingListener,
return;
}
- // Legacy poll loop for non-init OAPs that did not opt into the strict verify
- // mode. Static models (boot-time) still take this path; runtime-rule reconciler
- // explicitly chooses verify so this loop is bypassed.
- if (RunningMode.isNoInitMode()) {
+ // Poll loop for the STATIC boot-time path on a non-init OAP: the init OAP owns
+ // schema creation, so this node waits until the resource appears rather than
+ // creating it. Gated on deferDDLToInitNode (set only on SCHEMA_CREATE_IF_ABSENT),
+ // NOT on RunningMode alone — a runtime-rule DSL apply (withSchemaChange) is the
+ // operator/main-driven authority and must fall through to createTable below
+ // regardless of no-init, because no init OAP knows about a metric created at
+ // runtime. Without this, a no-init OAP would block here forever waiting for a
+ // resource that only this very apply would ever create.
+ if (deferDDLToInitNode(opt)) {
while (true) {
InstallInfo info = isExists(model, opt);
if (!info.isAllExist()) {
@@ -148,6 +153,23 @@ public abstract class ModelInstaller implements ModelRegistry.CreatingListener,
StorageManipulationOpt.Outcome.DROPPED, null);
}
+ /**
+ * True when this manipulation must defer all backend DDL to the dedicated init OAP and
+ * wait for it, rather than create / update / reshape the resource on this node. This is
+ * the single source of truth for the "no-init OAP doesn't own schema" rule across the
+ * base installer and every backend subclass — call it instead of re-checking
+ * {@link RunningMode#isNoInitMode()} inline, so the rule stays one decision.
+ *
+ * <p>True only for the static boot-time {@link StorageManipulationOpt#schemaCreateIfAbsent()}
+ * opt on a {@code no-init} OAP. The runtime-rule (DSL) opts leave
+ * {@link StorageManipulationOpt.Flags#isDeferDDLToInitNode() deferDDLToInitNode} unset, so
+ * an operator-driven apply is governed by the opt's own create / update / drop flags and
+ * by cluster main-ness — never by the init / no-init / default running mode.
+ */
+ protected static boolean deferDDLToInitNode(final StorageManipulationOpt opt) {
+ return RunningMode.isNoInitMode() && opt.getFlags().isDeferDDLToInitNode();
+ }
+
public void start() {
}
diff --git a/oap-server/server-core/src/main/java/org/apache/skywalking/oap/server/core/storage/model/StorageManipulationOpt.java b/oap-server/server-core/src/main/java/org/apache/skywalking/oap/server/core/storage/model/StorageManipulationOpt.java
index 3fe2e66eab..9b6d8cb04a 100644
--- a/oap-server/server-core/src/main/java/org/apache/skywalking/oap/server/core/storage/model/StorageManipulationOpt.java
+++ b/oap-server/server-core/src/main/java/org/apache/skywalking/oap/server/core/storage/model/StorageManipulationOpt.java
@@ -73,10 +73,11 @@ import lombok.Getter;
* <h3>{@link #verifySchemaOnly()} — {@link Mode#VERIFY_SCHEMA_ONLY}
(predicate: {@link #isVerifySchemaOnly()})</h3>
* <p>Callers:
* <ul>
- * <li>Boot-time reconciler pass on a non-init OAP — the operator declared
- * {@code init=false}, so this OAP must not perform DDL but must refuse
to start if
- * the backend isn't already in the shape the persisted runtime-rule
catalog
- * declares.</li>
+ * <li>Boot-time runtime-rule reconciler pass on a cluster <em>peer</em> (a
node that is
+ * not the hash-selected main for the file) — the main owns DDL, so this
node must not
+ * perform it but must refuse to start if the backend isn't already in
the shape the
+ * persisted runtime-rule catalog declares. Chosen by main-ness, not
running mode, so a
+ * peer behaves the same in no-init and default mode.</li>
* </ul>
* <p>Backend behaviour: read-only inspection. The installer issues the same
metadata
* read RPCs as {@link Mode#SCHEMA_CREATE_IF_ABSENT} but never invokes create
/ update / drop. On
@@ -137,15 +138,22 @@ public final class StorageManipulationOpt {
.escalateToCaller(true)
.build()),
/**
- * Static boot path on an init-mode OAP. Installer creates absent resources, but
- * if a resource already exists with a shape that diverges from the declared
- * model it records {@link Outcome#SKIPPED_SHAPE_MISMATCH} and does <strong>not</strong>
- * call update / reshape. Operator must reconcile via the runtime-rule REST
- * endpoint — boot is not allowed to silently mutate backend shape.
+ * Static boot-time model registration, run by every OAP. On an init / standalone
+ * OAP the installer creates absent resources, but if a resource already exists with
+ * a shape that diverges from the declared model it records
+ * {@link Outcome#SKIPPED_SHAPE_MISMATCH} and does <strong>not</strong> call
+ * update / reshape. Operator must reconcile via the runtime-rule REST endpoint —
+ * boot is not allowed to silently mutate backend shape.
+ *
+ * <p>This is the only mode that sets {@code deferDDLToInitNode}: on a {@code no-init}
+ * OAP the installer defers to the init OAP (waits in the
+ * {@link ModelInstaller#whenCreating} poll loop) rather than creating the resource
+ * itself. The runtime-rule (DSL) modes never defer.
*/
SCHEMA_CREATE_IF_ABSENT(Flags.builder()
.inspectBackend(true)
.createMissing(true)
+ .deferDDLToInitNode(true)
.build()),
/**
* Boot path on a non-init OAP. Installer issues the same read-only
inspection
@@ -247,6 +255,22 @@ public final class StorageManipulationOpt {
* the node.
*/
private final boolean escalateToCaller;
+ /**
+ * On a {@code no-init} OAP, defer all backend DDL to the dedicated init OAP and wait
+ * (poll loop in {@link ModelInstaller#whenCreating}) rather than create / update the
+ * resource here. Set ONLY on {@link Mode#SCHEMA_CREATE_IF_ABSENT} — the static
+ * boot-time model registration that every OAP runs. The init / no-init / default
+ * running-mode axis governs <strong>static</strong> schema only.
+ *
+ * <p>The runtime-rule (DSL) opts — {@link Mode#WITH_SCHEMA_CHANGE},
+ * {@link Mode#VERIFY_SCHEMA_ONLY}, {@link Mode#WITHOUT_SCHEMA_CHANGE} — leave this
+ * {@code false}, so an operator-driven runtime apply is driven by the other flags and
+ * by cluster main-ness, never by {@code RunningMode}. Without this distinction a
+ * no-init OAP (every production cluster node) would route a runtime {@code withSchemaChange}
+ * create into the init-node poll loop and block forever, because no init OAP knows
+ * about a metric that was created at runtime.
+ */
+ private final boolean deferDDLToInitNode;
}
@Getter
diff --git a/oap-server/server-core/src/test/java/org/apache/skywalking/oap/server/core/storage/model/ModelInstallerNoInitTest.java b/oap-server/server-core/src/test/java/org/apache/skywalking/oap/server/core/storage/model/ModelInstallerNoInitTest.java
new file mode 100644
index 0000000000..d9cda58cd7
--- /dev/null
+++ b/oap-server/server-core/src/test/java/org/apache/skywalking/oap/server/core/storage/model/ModelInstallerNoInitTest.java
@@ -0,0 +1,140 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ *
+ */
+
+package org.apache.skywalking.oap.server.core.storage.model;
+
+import java.time.Duration;
+import org.apache.skywalking.oap.server.core.RunningMode;
+import org.apache.skywalking.oap.server.core.storage.StorageException;
+import org.junit.jupiter.api.AfterEach;
+import org.junit.jupiter.api.Test;
+
+import static org.junit.jupiter.api.Assertions.assertEquals;
+import static org.junit.jupiter.api.Assertions.assertFalse;
+import static org.junit.jupiter.api.Assertions.assertTimeoutPreemptively;
+import static org.junit.jupiter.api.Assertions.assertTrue;
+import static org.mockito.Mockito.mock;
+import static org.mockito.Mockito.when;
+
+/**
+ * Regression guard for the runtime-rule (DSL) schema-change path on a {@code no-init} OAP —
+ * every production cluster node runs no-init. The base {@link ModelInstaller#whenCreating}
+ * poll loop must defer to the init OAP only for the static boot-time opt
+ * ({@link StorageManipulationOpt#schemaCreateIfAbsent()}); a runtime-rule
+ * {@link StorageManipulationOpt#withSchemaChange()} apply must fall through to
+ * {@code createTable} and create the resource itself, because no init OAP knows about a
+ * metric created at runtime. Before the {@code deferDDLToInitNode} flag, a no-init OAP
+ * routed the runtime create into the poll loop and blocked forever.
+ */
+class ModelInstallerNoInitTest {
+
+ @AfterEach
+ void resetRunningMode() {
+ // RunningMode is a process-wide static; setMode("") is a no-op, so reset to a
+ // neutral non-init/non-no-init value to avoid leaking no-init into other tests.
+ RunningMode.setMode("default");
+ }
+
+ @Test
+ void deferFlagSetOnlyOnStaticBootOpt() {
+ assertTrue(StorageManipulationOpt.schemaCreateIfAbsent().getFlags().isDeferDDLToInitNode(),
+ "static boot opt must defer DDL to the init node");
+ assertFalse(StorageManipulationOpt.withSchemaChange().getFlags().isDeferDDLToInitNode(),
+ "runtime-rule withSchemaChange must NOT defer — it is the DDL authority");
+ assertFalse(StorageManipulationOpt.verifySchemaOnly().getFlags().isDeferDDLToInitNode());
+ assertFalse(StorageManipulationOpt.withoutSchemaChange().getFlags().isDeferDDLToInitNode());
+ }
+
+ @Test
+ void noInitMainCreatesNewMetricUnderWithSchemaChange() {
+ RunningMode.setMode("no-init");
+ final RecordingInstaller installer = new RecordingInstaller(false /* resource absent */);
+ final Model model = mock(Model.class);
+ when(model.getName()).thenReturn("runtime_metric");
+
+ // Must return (not spin in the no-init poll loop) and must create the resource. The
+ // preemptive timeout turns a regression — the historical infinite wait — into a fast
+ // failure instead of a hung build.
+ assertTimeoutPreemptively(Duration.ofSeconds(10), () ->
+ installer.whenCreating(model, StorageManipulationOpt.withSchemaChange()));
+ assertEquals(1, installer.createTableCalls,
+ "runtime withSchemaChange on a no-init OAP must create the new resource");
+ }
+
+ @Test
+ void noInitStaticBootDefersToInitNode() throws StorageException {
+ RunningMode.setMode("no-init");
+ // Resource already present so the defer poll loop breaks on its first probe instead
+ // of waiting forever — lets the test assert the defer path without hanging.
+ final RecordingInstaller installer = new RecordingInstaller(true /* resource present */);
+ final Model model = mock(Model.class);
+ when(model.getName()).thenReturn("static_metric");
+
+ installer.whenCreating(model, StorageManipulationOpt.schemaCreateIfAbsent());
+ assertEquals(0, installer.createTableCalls,
+ "static boot on a no-init OAP must defer to the init node, never create");
+ }
+
+ @Test
+ void withSchemaChangeSkipsCreateWhenResourceAlreadyExists() throws StorageException {
+ RunningMode.setMode("no-init");
+ final RecordingInstaller installer = new RecordingInstaller(true /* resource present */);
+ final Model model = mock(Model.class);
+ when(model.getName()).thenReturn("existing_metric");
+
+ installer.whenCreating(model, StorageManipulationOpt.withSchemaChange());
+ assertEquals(0, installer.createTableCalls,
+ "withSchemaChange must not re-create a resource that already exists");
+ }
+
+ /** Minimal concrete {@link ModelInstaller} that records createTable calls and reports a
+ * fixed existence result, so the base whenCreating branching can be exercised without a
+ * real storage backend. */
+ private static final class RecordingInstaller extends ModelInstaller {
+ private final boolean resourcePresent;
+ private int createTableCalls;
+
+ private RecordingInstaller(final boolean resourcePresent) {
+ super(null, null);
+ this.resourcePresent = resourcePresent;
+ }
+
+ @Override
+ public InstallInfo isExists(final Model model, final StorageManipulationOpt opt) {
+ final TestInstallInfo info = new TestInstallInfo(model);
+ info.setAllExist(resourcePresent);
+ return info;
+ }
+
+ @Override
+ public void createTable(final Model model) {
+ createTableCalls++;
+ }
+ }
+
+ private static final class TestInstallInfo extends ModelInstaller.InstallInfo {
+ private TestInstallInfo(final Model model) {
+ super(model);
+ }
+
+ @Override
+ public String buildInstallInfoMsg() {
+ return "test";
+ }
+ }
+}
diff --git a/oap-server/server-storage-plugin/storage-banyandb-plugin/src/main/java/org/apache/skywalking/oap/server/storage/plugin/banyandb/BanyanDBIndexInstaller.java b/oap-server/server-storage-plugin/storage-banyandb-plugin/src/main/java/org/apache/skywalking/oap/server/storage/plugin/banyandb/BanyanDBIndexInstaller.java
index 47ceac8df3..cabfb75276 100644
--- a/oap-server/server-storage-plugin/storage-banyandb-plugin/src/main/java/org/apache/skywalking/oap/server/storage/plugin/banyandb/BanyanDBIndexInstaller.java
+++ b/oap-server/server-storage-plugin/storage-banyandb-plugin/src/main/java/org/apache/skywalking/oap/server/storage/plugin/banyandb/BanyanDBIndexInstaller.java
@@ -144,11 +144,13 @@ public class BanyanDBIndexInstaller extends
ModelInstaller {
installInfo.setAllExist(false);
return installInfo;
} else {
- // Run shape-compat checks unless we're in the legacy no-init poll loop
- // path. failOnAbsence implies the caller wants strict verification even
- // in non-init mode (VERIFY_SCHEMA_ONLY), so honour that instead of just
- // gating on RunningMode.
- final boolean runShapeChecks = !RunningMode.isNoInitMode() || opt.getFlags().isFailOnAbsence();
+ // Run shape-compat checks — and the updates they drive for withSchemaChange —
+ // unless this is the static boot-time path deferring to the init OAP. The
+ // runtime-rule DSL opts (withSchemaChange / verifySchemaOnly) are never
+ // deferred, so an operator-driven shape UPDATE reconciles on a no-init OAP
+ // exactly as on a default / standalone one. (verifySchemaOnly still runs the
+ // checks but records SKIPPED_SHAPE_MISMATCH instead of writing.)
+ final boolean runShapeChecks = !deferDDLToInitNode(opt);
if (model.isTimeSeries()) {
// register models only locally(Schema cache) but not remotely
if (model.isRecord()) {
@@ -637,10 +639,15 @@ public class BanyanDBIndexInstaller extends ModelInstaller {
optsBuilder.addAllDefaultStages(metadata.getResource().getDefaultQueryStages());
}
gBuilder.setResourceOpts(optsBuilder.build());
- if (!RunningMode.isNoInitMode()) {
- if (!groupAligned.contains(metadata.getGroup())) {
+ // Group DDL follows the opt, not RunningMode: a runtime-rule withSchemaChange
+ // creates / updates the group on whatever node reaches here (peers short-circuit
+ // earlier via inspectBackend=false), while the static boot path defers to the init
+ // OAP on no-init. Create is gated on createMissing and update on !failOnShapeMismatch
+ // so verifySchemaOnly stays read-only even though it is not deferred.
+ if (!deferDDLToInitNode(opt) && !groupAligned.contains(metadata.getGroup())) {
+ if (!resourceExist.isHasGroup()) {
// create the group if not exist
- if (!resourceExist.isHasGroup()) {
+ if (opt.getFlags().isCreateMissing()) {
try {
Group g = client.define(gBuilder.build());
if (g != null) {
@@ -653,16 +660,16 @@ public class BanyanDBIndexInstaller extends ModelInstaller {
throw ex;
}
}
- } else {
- // update the group if necessary
- if (this.checkGroup(metadata, client)) {
- opt.recordModRevision(client.update(gBuilder.build()));
- log.info("group {} updated", metadata.getGroup());
- }
}
- // mark the group as aligned
- groupAligned.add(metadata.getGroup());
+ } else {
+ // update the group if necessary
+ if (!opt.getFlags().isFailOnShapeMismatch() && this.checkGroup(metadata, client)) {
+ opt.recordModRevision(client.update(gBuilder.build()));
+ log.info("group {} updated", metadata.getGroup());
+ }
}
+ // mark the group as aligned
+ groupAligned.add(metadata.getGroup());
}
return resourceExist;
}
diff --git a/test/e2e-v2/cases/runtime-rule/cluster/cluster-flow.sh b/test/e2e-v2/cases/runtime-rule/cluster/cluster-flow.sh
index ac371559cb..0740a9947c 100755
--- a/test/e2e-v2/cases/runtime-rule/cluster/cluster-flow.sh
+++ b/test/e2e-v2/cases/runtime-rule/cluster/cluster-flow.sh
@@ -15,17 +15,26 @@
# See the License for the specific language governing permissions and
# limitations under the License.
-# Drives a runtime-rule apply on OAP-1 and asserts OAP-2 converges on the same
-# (catalog, name, contentHash) within the reconciler tick window. Run from the
-# repo root.
+# Runtime-rule lifecycle + cross-node convergence on a Kubernetes (kind) cluster
+# deployed in NO-INIT mode — the topology every production SkyWalking cluster runs
+# (a one-shot `-Dmode=init` Job creates static schema, the OAP Deployment runs
+# `-Dmode=no-init`). This is the deployment that exercises the runtime-rule
+# schema-change path on a no-init node: applying a NEW MAL rule must drive the
+# backend DDL (create the BanyanDB measure) on the cluster main even though it is a
+# no-init OAP — the init Job never knew about a metric created at runtime, so the
+# main is the only node that can create it.
#
-# Coverage:
-# 1. Apply seed-rule on OAP-1 → ACTIVE
-# 2. Wait for OAP-2 to see the rule via /list (one tick = ~30 s default)
-# 3. STRUCTURAL update on OAP-1 → re-converge on OAP-2 (different content hash)
+# Coverage (drive on OAP-1, observe convergence on OAP-2 within a reconciler tick):
+# 1. Apply seed-rule on OAP-1 → ACTIVE (NEW: first-time measure creation on no-init)
+# 2. OAP-2 converges on the same (status, contentHash)
+# 3. STRUCTURAL update on OAP-1 → re-converge on OAP-2 (new metric, new measure)
# 4. Inactivate on OAP-1 → INACTIVE on OAP-2
# 5. Delete on OAP-1 → row gone on OAP-2
#
+# The pre-fix bug: on a no-init OAP the apply blocked forever in the storage
+# installer's init-node poll loop and never created the measure, so step 1 never
+# reached ACTIVE. Reaching ACTIVE here is the end-to-end regression assertion.
+#
# Failures route to stderr so the e2e harness's stdout capture stays clean.
set -euo pipefail
@@ -33,26 +42,66 @@ set -euo pipefail
log() { echo "[cluster-flow] $*" >&2; }
fail() { log "FAIL: $*"; exit 1; }
+NS="${SW_NAMESPACE:-skywalking}"
+# Pod-template labels set by the skywalking-helm OAP Deployment (release name = skywalking).
+OAP_SELECTOR="${OAP_SELECTOR:-app=skywalking,component=oap,release=skywalking}"
OAP1_PORT="${OAP1_PORT:-17128}"
OAP2_PORT="${OAP2_PORT:-17129}"
OAP1_BASE="http://127.0.0.1:${OAP1_PORT}"
OAP2_BASE="http://127.0.0.1:${OAP2_PORT}"
+# Admin REST port inside each OAP container (SW_ADMIN_SERVER=default).
+ADMIN_CONTAINER_PORT="${ADMIN_CONTAINER_PORT:-17128}"
+
SEED_DIR="${SEED_DIR:-$(pwd)/test/e2e-v2/cases/runtime-rule/mal-storage/seed-rules}"
SEED_NEW="${SEED_DIR}/seed-rule.yaml"
SEED_STRUCT="${SEED_DIR}/seed-rule-structural.yaml"
CATALOG="otel-rules"
NAME="cluster_rr"
-# Two ticks worth — default reconciler interval is 30 s; allow a generous 90 s for
-# convergence on a busy CI host.
-CONVERGE_TIMEOUT_S="${CONVERGE_TIMEOUT_S:-90}"
+# Generous on a kind host: two reconciler ticks (default 30 s) + BanyanDB schema
+# propagation + RPC jitter.
+CONVERGE_TIMEOUT_S="${CONVERGE_TIMEOUT_S:-120}"
[ -f "${SEED_NEW}" ] || fail "seed-rule.yaml missing at ${SEED_NEW}"
+[ -f "${SEED_STRUCT}" ] || fail "seed-rule-structural.yaml missing at ${SEED_STRUCT}"
+
+# --- Discover the two OAP pods and port-forward each node's admin REST -------------
+# The OAP Deployment runs >= 2 replicas behind one Service; the Service load-balances,
+# so addressing individual nodes (to assert cross-node convergence) needs per-pod
+# forwards rather than a single Service forward.
+log "waiting for >= 2 ready OAP pods in ns/${NS} (selector: ${OAP_SELECTOR})"
+deadline=$(( $(date +%s) + 300 ))
+PODS=()
+while true; do
+ # Only Ready pods — a no-init OAP keeps port 12800 closed (and stays NotReady)
+ # until the init Job has created the static schema. Read into an array without
+ # mapfile/readarray so the script runs under macOS bash 3.2 as well as CI bash 4+.
+ PODS=()
+ while IFS= read -r _pod; do
+ [ -n "${_pod}" ] && PODS+=("${_pod}")
+ done < <(kubectl -n "${NS}" get pods -l "${OAP_SELECTOR}" \
+ -o jsonpath='{range .items[*]}{range @.status.conditions[?(@.type=="Ready")]}{@.status}{end} {.metadata.name}{"\n"}{end}' \
+ 2>/dev/null | awk '$1=="True"{print $2}')
+ if [ "${#PODS[@]}" -ge 2 ]; then
+ break
+ fi
+ if [ "$(date +%s)" -ge "${deadline}" ]; then
+ kubectl -n "${NS}" get pods -l "${OAP_SELECTOR}" >&2 || true
+ fail "fewer than 2 ready OAP pods after 300s (got ${#PODS[@]})"
+ fi
+ sleep 5
+done
+POD1="${PODS[0]}"
+POD2="${PODS[1]}"
+log "OAP pods: OAP-1=${POD1} OAP-2=${POD2}"
-# All runtime-rule REST calls go through swctl's `admin` command tree instead of
-# raw curl. This flow drives two OAP nodes, so the admin host (`--admin-url`) is
-# passed per call as the first argument. `--display json` keeps the body shape
-# identical to the old curl output, so the jq assertions are unchanged.
+kubectl -n "${NS}" port-forward "pod/${POD1}" "${OAP1_PORT}:${ADMIN_CONTAINER_PORT}" >/dev/null 2>&1 &
+PF1=$!
+kubectl -n "${NS}" port-forward "pod/${POD2}" "${OAP2_PORT}:${ADMIN_CONTAINER_PORT}" >/dev/null 2>&1 &
+PF2=$!
+trap 'kill "${PF1}" "${PF2}" 2>/dev/null || true' EXIT
+
+# --- swctl admin helpers (per-node --admin-url) ------------------------------------
admin() { local base="$1"; shift; swctl --display json --admin-url="${base}" admin "$@"; }
list_row() {
@@ -64,26 +113,17 @@ list_row() {
| head -1
}
-list_status() {
- local base="$1"
- list_row "${base}" | jq -r '.status // empty'
-}
-
-list_hash() {
- local base="$1"
- list_row "${base}" | jq -r '.contentHash // empty'
-}
+list_status() { list_row "$1" | jq -r '.status // empty'; }
+list_hash() { list_row "$1" | jq -r '.contentHash // empty'; }
+list_apply_error() { list_row "$1" | jq -r '.lastApplyError // empty'; }
await_status() {
local base="$1" expected="$2" deadline=$(( $(date +%s) + CONVERGE_TIMEOUT_S ))
while true; do
- local got
- got="$(list_status "${base}")"
- if [ "${got}" = "${expected}" ]; then
- return 0
- fi
+ local got; got="$(list_status "${base}")"
+ [ "${got}" = "${expected}" ] && return 0
if [ "$(date +%s)" -ge "${deadline}" ]; then
- fail "${base} did not reach status='${expected}' within ${CONVERGE_TIMEOUT_S}s (last='${got}')"
+ fail "${base} did not reach status='${expected}' within ${CONVERGE_TIMEOUT_S}s (last='${got}', applyError='$(list_apply_error "${base}")')"
fi
sleep 2
done
@@ -92,11 +132,8 @@ await_status() {
await_hash() {
local base="$1" expected_hash="$2" deadline=$(( $(date +%s) + CONVERGE_TIMEOUT_S ))
while true; do
- local got
- got="$(list_hash "${base}")"
- if [ "${got}" = "${expected_hash}" ]; then
- return 0
- fi
+ local got; got="$(list_hash "${base}")"
+ [ "${got}" = "${expected_hash}" ] && return 0
if [ "$(date +%s)" -ge "${deadline}" ]; then
fail "${base} did not converge to contentHash='${expected_hash:0:8}…' within ${CONVERGE_TIMEOUT_S}s (last='${got:0:8}…')"
fi
@@ -107,9 +144,7 @@ await_hash() {
await_absent() {
local base="$1" deadline=$(( $(date +%s) + CONVERGE_TIMEOUT_S ))
while true; do
- if [ -z "$(list_row "${base}")" ]; then
- return 0
- fi
+ [ -z "$(list_row "${base}")" ] && return 0
if [ "$(date +%s)" -ge "${deadline}" ]; then
fail "${base} did not drop row within ${CONVERGE_TIMEOUT_S}s"
fi
@@ -117,43 +152,46 @@ await_absent() {
done
}
+assert_no_apply_error() {
+ local base="$1" err; err="$(list_apply_error "${base}")"
+ [ -z "${err}" ] || fail "${base} reports lastApplyError='${err}' (no-init schema change failed)"
+}
+
apply_on() {
local base="$1" body="$2" extra="${3:-}"
local -a flags=(--catalog "${CATALOG}" --name "${NAME}" -f "${body}")
[[ "${extra}" == *allowStorageChange=true* ]] && flags+=(--allow-storage-change)
- local resp; resp="$(admin "${base}" runtime-rule add "${flags[@]}")" \
- || fail "addOrUpdate against ${base} failed"
- echo "${resp}"
+ admin "${base}" runtime-rule add "${flags[@]}" || fail "addOrUpdate against ${base} failed"
}
-# --- Wait for both OAPs to come up -------------------------------------------------
-log "waiting for OAP-1 (${OAP1_BASE})"
-deadline=$(( $(date +%s) + 120 ))
-until admin "${OAP1_BASE}" runtime-rule list >/dev/null 2>&1; do
- if [ "$(date +%s)" -ge "${deadline}" ]; then fail "OAP-1 not ready after 120s"; fi
- sleep 2
-done
-log "waiting for OAP-2 (${OAP2_BASE})"
-deadline=$(( $(date +%s) + 120 ))
-until admin "${OAP2_BASE}" runtime-rule list >/dev/null 2>&1; do
- if [ "$(date +%s)" -ge "${deadline}" ]; then fail "OAP-2 not ready after 120s"; fi
- sleep 2
+# --- Wait for both OAPs' admin REST to answer through the forwards -----------------
+for pair in "OAP-1 ${OAP1_BASE}" "OAP-2 ${OAP2_BASE}"; do
+ set -- ${pair}; label="$1"; base="$2"
+ log "waiting for ${label} admin REST (${base})"
+ deadline=$(( $(date +%s) + 120 ))
+ until admin "${base}" runtime-rule list >/dev/null 2>&1; do
+ if [ "$(date +%s)" -ge "${deadline}" ]; then fail "${label} admin not ready after 120s"; fi
+ sleep 2
+ done
done
-log "both OAPs ready"
+log "both OAP admin endpoints ready"
-# --- Phase 1: apply on OAP-1, observe convergence on OAP-2 -------------------------
-log "=== Phase 1: apply (NEW) on OAP-1 ==="
+# --- Phase 1: apply NEW on OAP-1 — first-time measure creation on a no-init node ----
+log "=== Phase 1: apply (NEW) on OAP-1 — exercises no-init schema creation ==="
apply_on "${OAP1_BASE}" "${SEED_NEW}" >/dev/null
await_status "${OAP1_BASE}" "ACTIVE"
+assert_no_apply_error "${OAP1_BASE}"
hash_initial="$(list_hash "${OAP1_BASE}")"
-log "OAP-1 → ACTIVE @ ${hash_initial:0:8}…"
+log "OAP-1 → ACTIVE @ ${hash_initial:0:8}… (measure created on a no-init OAP)"
await_status "${OAP2_BASE}" "ACTIVE"
await_hash "${OAP2_BASE}" "${hash_initial}"
log "OAP-2 converged to ${hash_initial:0:8}…"
-# --- Phase 2: STRUCTURAL update on OAP-1, observe new hash on OAP-2 ----------------
+# --- Phase 2: STRUCTURAL update on OAP-1 — second measure created on no-init --------
log "=== Phase 2: STRUCTURAL on OAP-1 ==="
apply_on "${OAP1_BASE}" "${SEED_STRUCT}" "allowStorageChange=true" >/dev/null
+await_status "${OAP1_BASE}" "ACTIVE"
+assert_no_apply_error "${OAP1_BASE}"
hash_struct="$(list_hash "${OAP1_BASE}")"
[ "${hash_struct}" != "${hash_initial}" ] || fail "OAP-1 contentHash unchanged after STRUCTURAL apply"
log "OAP-1 → ACTIVE @ ${hash_struct:0:8}… (was ${hash_initial:0:8}…)"
@@ -178,4 +216,33 @@ log "OAP-1 → row gone"
await_absent "${OAP2_BASE}"
log "OAP-2 converged: row gone"
-log "=== ALL CLUSTER PHASES PASSED ==="
+# --- Phase 5: forward-path coverage — drive a write on OAP-2 -----------------------
+# Phases 1-4 drove OAP-1; whether that exercised the cross-node Forward depends on which
+# node the hash-router picked as main. Driving a write on OAP-2 as well guarantees the
+# Forward path is exercised regardless: whichever of OAP-1 / OAP-2 is NOT the main forwards
+# the write to the main. This is the path that regressed on Kubernetes — every replica
+# shared selfNodeId=0.0.0.0_11800 (the 0.0.0.0 gRPC bind host), so the main's self-loop
+# guard rejected a legitimate forward as HTTP 400 forward_self_loop. With a unique per-pod
+# id the forward completes; a failure here (esp. forward_self_loop) re-opens that bug.
+NAME_B="cluster_rr_fwd"
+log "=== Phase 5: apply on OAP-2 (guarantees cross-node Forward coverage) ==="
+admin "${OAP2_BASE}" runtime-rule add --catalog "${CATALOG}" --name "${NAME_B}" -f "${SEED_NEW}" >/dev/null \
+ || fail "addOrUpdate on OAP-2 failed — cross-node Forward broken (e.g. forward_self_loop)?"
+b_deadline=$(( $(date +%s) + CONVERGE_TIMEOUT_S ))
+while true; do
+ b_status="$(admin "${OAP2_BASE}" runtime-rule list 2>/dev/null \
+ | jq -r '.rules[] | select(.catalog=="'"${CATALOG}"'" and .name=="'"${NAME_B}"'") | .status' | head -1)"
+ [ "${b_status}" = "ACTIVE" ] && break
+ [ "$(date +%s)" -ge "${b_deadline}" ] && fail "OAP-2 write did not reach ACTIVE within ${CONVERGE_TIMEOUT_S}s (last='${b_status}')"
+ sleep 2
+done
+log "OAP-2 write → ACTIVE (cross-node Forward path OK)"
+# Cleanup also forwards from OAP-2: inactivate (soft-pause) is required before delete,
+# so this exercises the Forward path for the inactivate + delete operations too.
+admin "${OAP2_BASE}" runtime-rule inactivate --catalog "${CATALOG}" --name "${NAME_B}" >/dev/null \
+ || fail "inactivate of ${NAME_B} on OAP-2 failed"
+admin "${OAP2_BASE}" runtime-rule delete --catalog "${CATALOG}" --name "${NAME_B}" >/dev/null \
+ || fail "cleanup delete of ${NAME_B} on OAP-2 failed"
+log "Phase 5 cleanup done (inactivate + delete forwarded OK)"
+
+log "=== ALL CLUSTER (kind) PHASES PASSED ==="
diff --git a/test/e2e-v2/cases/runtime-rule/cluster/docker-compose.yml b/test/e2e-v2/cases/runtime-rule/cluster/docker-compose.yml
deleted file mode 100644
index c82047c541..0000000000
--- a/test/e2e-v2/cases/runtime-rule/cluster/docker-compose.yml
+++ /dev/null
@@ -1,93 +0,0 @@
-# Licensed to the Apache Software Foundation (ASF) under one or more
-# contributor license agreements. See the NOTICE file distributed with
-# this work for additional information regarding copyright ownership.
-# The ASF licenses this file to You under the Apache License, Version 2.0
-# (the "License"); you may not use this file except in compliance with
-# the License. You may obtain a copy of the License at
-#
-# http://www.apache.org/licenses/LICENSE-2.0
-#
-# Unless required by applicable law or agreed to in writing, software
-# distributed under the License is distributed on an "AS IS" BASIS,
-# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
-# See the License for the specific language governing permissions and
-# limitations under the License.
-
-# Cluster convergence — 2 OAPs behind a ZooKeeper coordinator + BanyanDB.
-# Verifies that a runtime-rule apply on one node propagates to the other within a
-# reconciler tick (default 30 s) and that the Suspend / Resume RPC bracket dispatch
-# correctly across the cluster.
-services:
- zookeeper:
- image: zookeeper:3.8
- networks:
- - e2e
- environment:
- ZOO_4LW_COMMANDS_WHITELIST: "ruok,stat,srvr"
- healthcheck:
- # Use the zookeeper-shell.sh ls wrapper (image's own /bin) — the official
- # zookeeper:3.8 image does not ship `nc`, so the more obvious `echo ruok | nc ...`
- # idiom fails. zkServer.sh status returns 0 once the server is in standalone /
- # leader mode.
- test: ["CMD-SHELL", "zkServer.sh status 2>/dev/null | grep -E 'Mode: (standalone|leader|follower)'"]
- interval: 5s
- timeout: 10s
- retries: 30
-
- banyandb:
- extends:
- file: ../../../script/docker-compose/base-compose.yml
- service: banyandb
-
- oap1:
- extends:
- file: ../../../script/docker-compose/base-compose.yml
- service: oap
- hostname: oap1
- environment:
- SW_ADMIN_SERVER: default
- SW_RECEIVER_RUNTIME_RULE: default
- SW_STORAGE: banyandb
- SW_CLUSTER: zookeeper
- SW_CLUSTER_ZK_HOST_PORT: zookeeper:2181
- # First-up node also doubles as the static-rule installer; nothing to coordinate
- # with peers on storage init.
- ports:
- - "11800:11800"
- - "12800:12800"
- - "17128:17128"
- depends_on:
- zookeeper:
- condition: service_healthy
- banyandb:
- condition: service_healthy
- networks:
- - e2e
-
- oap2:
- extends:
- file: ../../../script/docker-compose/base-compose.yml
- service: oap
- hostname: oap2
- environment:
- SW_ADMIN_SERVER: default
- SW_RECEIVER_RUNTIME_RULE: default
- SW_STORAGE: banyandb
- SW_CLUSTER: zookeeper
- SW_CLUSTER_ZK_HOST_PORT: zookeeper:2181
- ports:
- - "11801:11800"
- - "12801:12800"
- - "17129:17128"
- depends_on:
- zookeeper:
- condition: service_healthy
- banyandb:
- condition: service_healthy
- oap1:
- condition: service_healthy
- networks:
- - e2e
-
-networks:
- e2e:
diff --git a/test/e2e-v2/cases/runtime-rule/cluster/e2e.yaml b/test/e2e-v2/cases/runtime-rule/cluster/e2e.yaml
index d6b82f0c8d..d0dbc3bee9 100644
--- a/test/e2e-v2/cases/runtime-rule/cluster/e2e.yaml
+++ b/test/e2e-v2/cases/runtime-rule/cluster/e2e.yaml
@@ -13,17 +13,32 @@
# See the License for the specific language governing permissions and
# limitations under the License.
-# 2-OAP cluster + ZK + BanyanDB. Drives apply / inactivate / delete on OAP-1 and
-# verifies OAP-2 converges within a reconciler tick (default 30 s).
+# Runtime-rule lifecycle + cross-node convergence on a Kubernetes (kind) cluster
+# deployed via the skywalking-helm chart — a 2-replica OAP Deployment running in
+# NO-INIT mode (`-Dmode=no-init`) behind a one-shot `-Dmode=init` schema-init Job,
+# the exact topology every production SkyWalking cluster uses. This is the case that
+# exercises the runtime-rule schema-change path on a no-init OAP: an operator apply
+# must drive backend DDL (create the BanyanDB measure) on the cluster main even
+# though it is a no-init node. BanyanDB native cluster coordination (SW_CLUSTER=
+# kubernetes) is wired by the chart; ZooKeeper is not needed.
setup:
- env: compose
- file: docker-compose.yml
- timeout: 25m
+ env: kind
+ file: kind.yaml
+ timeout: 30m
init-system-environment: ../../../script/env
+ kind:
+ import-images:
+ - skywalking/oap:latest
steps:
- name: set PATH
command: export PATH=/tmp/skywalking-infra-e2e/bin:$PATH
+ - name: install yq
+ command: bash test/e2e-v2/script/prepare/setup-e2e-shell/install.sh yq
+ - name: install swctl
+ command: bash test/e2e-v2/script/prepare/setup-e2e-shell/install.sh swctl
+ - name: install kubectl
+ command: bash test/e2e-v2/script/prepare/setup-e2e-shell/install.sh kubectl
- name: install jq
command: |
if ! command -v jq >/dev/null 2>&1; then
@@ -31,11 +46,42 @@ setup:
https://github.com/jqlang/jq/releases/download/jq-1.7.1/jq-linux-amd64
chmod +x /tmp/skywalking-infra-e2e/bin/jq
fi
- - name: install swctl
- command: bash test/e2e-v2/script/prepare/setup-e2e-shell/install.sh swctl
- - name: drive cluster convergence flow
+ - name: install helm
+ command: bash test/e2e-v2/script/prepare/setup-e2e-shell/install.sh helm
+ # 2-replica OAP Deployment (no-init) + init Job + admin server + runtime-rule
+ # receiver + BanyanDB. fullnameOverride=skywalking makes the OAP Service / pods
+ # discoverable as skywalking-oap with labels app=skywalking,component=oap.
+ - name: install SkyWalking (no-init cluster + BanyanDB) via helm
+ command: |
+ export PATH=/tmp/skywalking-infra-e2e/bin:$PATH
+ helm -n skywalking install skywalking \
+ oci://ghcr.io/apache/skywalking-helm/skywalking-helm \
+ --version "0.0.0-${SW_KUBERNETES_COMMIT_SHA}" \
+ --create-namespace \
+ --set fullnameOverride=skywalking \
+ --set elasticsearch.enabled=false \
+ --set oap.replicas=2 \
+ --set oap.image.repository=skywalking/oap \
+ --set oap.image.tag=latest \
+ --set oap.imagePullPolicy=IfNotPresent \
+ --set oap.storageType=banyandb \
+ --set oap.env.SW_ADMIN_SERVER=default \
+ --set oap.env.SW_RECEIVER_RUNTIME_RULE=default \
+ --set ui.enabled=false \
+ --set banyandb.enabled=true \
+ --set banyandb.standalone.enabled=true \
+ --set banyandb.cluster.enabled=false \
+ --set banyandb.image.repository=ghcr.io/apache/skywalking-banyandb \
+ --set banyandb.image.tag=${SW_BANYANDB_COMMIT}
+ wait:
+ # The init Job must complete (creates the static schema) before the no-init
+ # OAP Deployment can become Available.
+ - namespace: skywalking
+ resource: deployment/skywalking-oap
+ for: condition=available
+ timeout: 20m
+ - name: drive runtime-rule lifecycle + cross-node convergence (no-init)
command: |
- set -euo pipefail
export PATH=/tmp/skywalking-infra-e2e/bin:$PATH
bash test/e2e-v2/cases/runtime-rule/cluster/cluster-flow.sh
@@ -44,18 +90,8 @@ verify:
count: 1
interval: 1s
cases:
- - query: swctl --display json --admin-url=http://127.0.0.1:17128 admin runtime-rule list >/dev/null && echo ok
- expected: expected/ok.txt
-
-cleanup:
- on: always
- collect:
- on: failure
- output-dir: $SW_INFRA_E2E_LOG_DIR/runtime-rule/cluster
- items:
- - service: oap1
- paths:
- - /skywalking/logs/
- - service: oap2
- paths:
- - /skywalking/logs/
+ # The lifecycle assertions live in cluster-flow.sh (it exits non-zero on any
+ # failure, failing setup). This is a thin liveness check that the no-init cluster
+ # came up with both OAP replicas Ready.
+ - query: kubectl -n skywalking get deployment skywalking-oap -o jsonpath='{.status.readyReplicas}'
+ expected: expected/ready-replicas.txt
diff --git a/test/e2e-v2/cases/runtime-rule/cluster/expected/ok.txt b/test/e2e-v2/cases/runtime-rule/cluster/expected/ok.txt
deleted file mode 100644
index 9766475a41..0000000000
--- a/test/e2e-v2/cases/runtime-rule/cluster/expected/ok.txt
+++ /dev/null
@@ -1 +0,0 @@
-ok
diff --git a/test/e2e-v2/cases/runtime-rule/cluster/expected/ready-replicas.txt b/test/e2e-v2/cases/runtime-rule/cluster/expected/ready-replicas.txt
new file mode 100644
index 0000000000..d8263ee986
--- /dev/null
+++ b/test/e2e-v2/cases/runtime-rule/cluster/expected/ready-replicas.txt
@@ -0,0 +1 @@
+2
\ No newline at end of file
diff --git a/test/e2e-v2/cases/runtime-rule/cluster/kind.yaml b/test/e2e-v2/cases/runtime-rule/cluster/kind.yaml
new file mode 100644
index 0000000000..a57ada7120
--- /dev/null
+++ b/test/e2e-v2/cases/runtime-rule/cluster/kind.yaml
@@ -0,0 +1,23 @@
+# Licensed to the Apache Software Foundation (ASF) under one or more
+# contributor license agreements. See the NOTICE file distributed with
+# this work for additional information regarding copyright ownership.
+# The ASF licenses this file to You under the Apache License, Version 2.0
+# (the "License"); you may not use this file except in compliance with
+# the License. You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+# Single-node kind cluster for the runtime-rule no-init cluster e2e. One node is
+# enough — the OAP Deployment's replicas (the no-init cluster) and the BanyanDB pod
+# all schedule here. Node image pinned to the same k8s 1.28 build the istio e2e uses.
+kind: Cluster
+apiVersion: kind.x-k8s.io/v1alpha4
+nodes:
+ - role: control-plane
+ image: kindest/node:v1.28.15@sha256:a7c05c7ae043a0b8c818f5a06188bc2c4098f6cb59ca7d1856df00375d839251