This is an automated email from the ASF dual-hosted git repository.

wu-sheng pushed a commit to branch swip-15-banyandb-so11y-rules
in repository https://gitbox.apache.org/repos/asf/skywalking.git

commit 6a96f7f19c27da296c494e90a990a06a2b600f41
Author: Wu Sheng <[email protected]>
AuthorDate: Wed Jun 10 23:43:27 2026 +0800

    SWIP-15: implement BanyanDB self-observability (cluster / container / group model)

    Rebuild otel-rules/banyandb around the cluster reality: Service = cluster,
    ServiceInstance = container (pod_name + container_name, with role/tier
    attributes), Endpoint = group. Add banyandb-endpoint.yaml; redesign
    service/instance rules to mirror the upstream FODC-proxy Grafana boards.
    Requires BanyanDB 0.11+.

    Rewrite the e2e to a no-FODC file-discovery cluster (1 liaison + 1 hot data
    node); the collector scrapes each node's :2121 directly and injects the
    identity labels. Operator docs rewritten to the cluster/container/group model.

    Validated: DSLClassGeneratorTest compiles all rules via the production path;
    the e2e passes 16/16 against BanyanDB 0.11.

    Co-Authored-By: Claude Fable 5 <[email protected]>
---
 docs/en/banyandb/dashboards-banyandb.md | 176 +++++++++++++----
 docs/en/changes/changes.md | 8 +
 .../otel-rules/banyandb/banyandb-endpoint.yaml | 96 +++++++++
 .../otel-rules/banyandb/banyandb-instance.yaml | 217 ++++++++++++++-------
 .../otel-rules/banyandb/banyandb-service.yaml | 113 +++++------
 test/e2e-v2/cases/banyandb/banyandb-cases.yaml | 63 ++++--
 test/e2e-v2/cases/banyandb/docker-compose.yml | 64 ++++--
 test/e2e-v2/cases/banyandb/e2e.yaml | 5 +-
 .../metrics-has-label-value.yml} | 57 +++---
 .../{otel-collector-config.yaml => nodes.yaml} | 39 +---
 .../cases/banyandb/otel-collector-config.yaml | 30 ++-
 11 files changed, 589 insertions(+), 279 deletions(-)

diff --git a/docs/en/banyandb/dashboards-banyandb.md b/docs/en/banyandb/dashboards-banyandb.md
index 1d07018f41..33ee16606f 100644
--- a/docs/en/banyandb/dashboards-banyandb.md
+++ b/docs/en/banyandb/dashboards-banyandb.md
@@ -1,49 +1,141 @@
-# BanyanDB self observability dashboard
+# BanyanDB self-observability dashboard
-[BanyanDB](https://skywalking.apache.org/docs/skywalking-banyandb/next/readme/), as an observability database, aims to ingest, analyze and store Metrics, Tracing, and Logging data. It's designed to handle observability data generated by **Apache SkyWalking**,it also provides a dashboard to visualize the self-observability metrics. +[Apache SkyWalking BanyanDB](https://skywalking.apache.org/docs/skywalking-banyandb/next/readme/) is the +native storage for SkyWalking. A production deployment is one **cluster** made of many **nodes**, each +running one or more **containers** with a role (`liaison` front door, `data` backend, and the `lifecycle` +tier-migration sidecar), and data is organized into **groups**. SkyWalking models that reality directly +and renders it on the `Layer: BANYANDB` dashboards in the Horizon UI: + +| SkyWalking entity | BanyanDB concept | Identity | +| ----------------- | ---------------- | -------- | +| `Service` | one BanyanDB **cluster** | the `cluster` label | +| `ServiceInstance` | one **container** on a node | `pod_name` + `container_name` (joined by `@`) | +| ↳ attributes | role / tier | `container_name` (`liaison`/`data`/`lifecycle`), `node_type` (`hot`/`warm`/`cold`), `node_role`, `pod_name` | +| `Endpoint` | one **group** (storage partition) | the `group` label (e.g. `sw_metricsMinute`) | + +> **Requires BanyanDB 0.11+.** This feature reads the FODC-proxy cluster-observability metric families +> and the queue / lifecycle metric families that BanyanDB introduced after 0.10. Run a 0.11+ cluster +> with the FODC proxy and the Prometheus metrics provider enabled. ## Data flow -1. [BanyanDB](https://skywalking.apache.org/docs/skywalking-banyandb/next/readme/) collects metrics data internally and exposes a Prometheus http endpoint to retrieve the metrics. -2. OpenTelemetry Collector fetches metrics from BanyanDB and pushes metrics to SkyWalking OAP Server via OpenTelemetry gRPC exporter. -3. 
The SkyWalking OAP Server parses the expression with [MAL](../concepts-and-designs/mal.md) to filter/calculate/aggregate and store the results. + +1. Each BanyanDB container exposes its metrics; in a cluster the + [FODC proxy](https://skywalking.apache.org/docs/skywalking-banyandb/next/operation/fodc/overview/) + aggregates every container's Prometheus metrics onto a single `/metrics` endpoint (default `:17913`) + and stamps each sample with per-container identity labels (`pod_name`, `container_name`, `node_role`, + and `node_type` on data containers). +2. An OpenTelemetry Collector scrapes the FODC proxy `/metrics` as the single Prometheus target, adds a + static `cluster: <name>` label (the only label SkyWalking must inject), and pushes via the + OpenTelemetry gRPC exporter to the SkyWalking OAP Server. +3. The OAP Server parses the [MAL](../concepts-and-designs/mal.md) rules under `otel-rules/banyandb/` to + filter / calculate / aggregate and store the cluster, instance and group metrics. ## Set up -1. Start [BanyanDB](https://skywalking.apache.org/docs/skywalking-banyandb/next/readme/),supporting both [Standalone Mode](https://skywalking.apache.org/docs/skywalking-banyandb/next/installation/standalone/) and [Cluster Mode](https://skywalking.apache.org/docs/skywalking-banyandb/next/installation/cluster/). -2. Set up [OpenTelemetry Collector ](https://opentelemetry.io/docs/collector/getting-started/#docker). For details on Prometheus Receiver in OpenTelemetry Collector, refer to [here](../../../test/e2e-v2/cases/banyandb/otel-collector-config.yaml). -3. Config SkyWalking [OpenTelemetry receiver](https://skywalking.apache.org/docs/main/next/en/setup/backend/opentelemetry-receiver/). - -## BanyanDB monitoring -Self observability monitoring provides monitoring of the status and resources of the [BanyanDB](https://skywalking.apache.org/docs/skywalking-banyandb/next/readme/) server itself. 
`banyandb-server` is a `Service` in BanyanDB, and land on the `Layer: BANYANDB`. - -### Self observability metrics - -| Unit | Metric Name | Description | Data Source | -|------|---------------------------------------------------|-------------|-------------| -| o/s | meter_banyandb_write_rate | Write Rate (Operations per Second) | BanyanDB | -| GiB | meter_banyandb_total_memory | Total Memory | BanyanDB | -| GiB | meter_banyandb_disk_usage | Disk Usage | BanyanDB | -| r/s | meter_banyandb_query_rate | Query Rate (Requests per Second) | BanyanDB | -| Count | meter_banyandb_total_cpu | Total CPU Cores | BanyanDB | -| c/m | meter_banyandb_write_and_query_errors_rate | Write and Query Errors Rate(Counts per Minute) | BanyanDB | -| c/s | meter_banyandb_etcd_operation_rate | Etcd Operation Rate(Counts per Second) | BanyanDB | -| Count | meter_banyandb_active_instance | Active Instances | BanyanDB | -| % | meter_banyandb_cpu_usage | CPU Usage Percentage | BanyanDB | -| % | meter_banyandb_rss_memory_usage | RSS Memory Usage Percentage | BanyanDB | -| % | meter_banyandb_disk_usage_all | Disk Usage Percentage | BanyanDB | -| KiB/s | meter_banyandb_network_usage_recv | Network Receive Rate | BanyanDB | -| KiB/s | meter_banyandb_network_usage_sent | Network Send Rate | BanyanDB | -| o/s | meter_banyandb_storage_write_rate | Storage Write Rate (Operations per Second) | BanyanDB | -| s | meter_banyandb_query_latency | Query Latency (s) | BanyanDB | -| Count | meter_banyandb_total_data | Total Data Elements | BanyanDB | -| r/m | meter_banyandb_merge_file_data | Merge File Data Rate(Revolutions per Minute) | BanyanDB | -| s | meter_banyandb_merge_file_latency | Merge File Latency(s) | BanyanDB | -| Count | meter_banyandb_merge_file_partitions | Merge File Partitions | BanyanDB | -| o/s | meter_banyandb_series_write_rate | Series Write Rate (Operations per Second) | BanyanDB | -| o/s | meter_banyandb_series_term_search_rate | Series Term Search Rate (Operations per Second) | 
BanyanDB | -| Count | meter_banyandb_total_series | Total Series Count | BanyanDB | -| ops | meter_banyandb_stream_write_rate | Stream Write Rate (Operations per Second) | BanyanDB | -| ops | meter_banyandb_term_search_rate | Term Search Rate (Operations per Second) | BanyanDB | -| Count | meter_banyandb_total_document | Total Document Count | BanyanDB | + +1. Run a BanyanDB **0.11+** cluster (liaison + data nodes; data nodes may be tiered hot/warm/cold) with + the **FODC proxy** enabled and the Prometheus metrics provider on (default). Standalone mode is the + degenerate case — one cluster, one node, one `container_name=standalone`. +2. Run an **OpenTelemetry Collector** whose `prometheus` receiver scrapes the FODC proxy `/metrics` + (`:17913`) as the single target and adds a static `cluster: <name>` label, exporting OTLP to OAP. For + a runnable example, see + [the e2e collector config](../../../test/e2e-v2/cases/banyandb/otel-collector-config.yaml). +3. Enable SkyWalking's + [OpenTelemetry receiver](https://skywalking.apache.org/docs/main/next/en/setup/backend/opentelemetry-receiver/). + The `banyandb/*` rules are enabled by default in `enabledOtelMetricsRules`. +4. Open the **Horizon UI** → `BanyanDB` layer. + +## Metrics + +The metric source expressions mirror the upstream BanyanDB Grafana boards, so the SkyWalking dashboards +stay in lockstep with the BanyanDB catalog. The rule files are +`otel-rules/banyandb/banyandb-service.yaml`, `banyandb-instance.yaml` and `banyandb-endpoint.yaml`. 
+ +### Service scope — cluster summary (`meter_banyandb_*`) + +| Unit | Metric | Description | +| ---- | ------ | ----------- | +| w/s | `meter_banyandb_cluster_write_rate` | Cluster write rate across measure/stream/trace | +| r/s | `meter_banyandb_cluster_query_rate` | Cluster query rate | +| c/m | `meter_banyandb_cluster_error_rate` | Cluster error rate (counts/min) | +| Count | `meter_banyandb_reporting_instances` | Live container count by role | +| Count | `meter_banyandb_total_cpu_cores` | Cluster CPU capacity | +| Bytes | `meter_banyandb_total_memory_used` | Cluster memory used | +| Bytes | `meter_banyandb_total_disk_used` | Cluster disk used | + +### Instance scope — per container (`meter_banyandb_instance_*`) + +**All roles** (every container emits these): + +| Unit | Metric | Description | +| ---- | ------ | ----------- | +| s | `node_uptime` | Node uptime | +| Cores | `cpu_usage` | CPU usage | +| Bytes | `rss_memory` | Resident memory | +| percentunit | `system_memory_percent` | System memory used fraction | +| percentunit | `disk_usage_percent` | Disk used fraction (Σused/Σtotal) | +| Bytes | `disk_used_by_path` / `disk_total_by_path` | Disk used / total by mount path | +| percentunit | `disk_used_percent_by_path` | Disk used fraction by mount path | +| Bytes/s | `network_recv` / `network_sent` | Network throughput by interface | +| Count | `goroutines` | Go goroutines | +| s | `gc_pause_avg` | Average GC pause | +| Bytes | `heap_inuse` / `heap_next_gc` | Go heap in-use / next-GC threshold | +| Bytes/s | `alloc_rate` | Go allocation rate | + +**Liaison** (front door; the dashboard gates these on `container_name == 'liaison'`): + +| Unit | Metric | Description | +| ---- | ------ | ----------- | +| r/s | `query_rate_by_service` | Query rate by data-model service | +| c/m | `grpc_error_rate` | gRPC error rate | +| r/s | `non_query_op_rate` | Registry / non-query operation rate | +| w/s | `write_rate` | Write rate seen at the front door | +| ops | 
`publish_throughput` | Tier-2 publish throughput by operation | +| Bytes/s | `publish_bytes` | Publish bytes | +| s | `publish_latency_p99` | Publish send latency p99 | +| Count | `wqueue_pending` / `wqueue_file_parts` / `wqueue_mem_part` | Write-queue depth | + +**Data** (backend; the dashboard gates these on `container_name == 'data'`): + +| Unit | Metric | Description | +| ---- | ------ | ----------- | +| Count | `total_data` | Total stored data elements | +| o/s | `merge_file_rate` | Merge-loop rate | +| Count | `merge_file_partitions` | Avg parts merged per loop | +| s | `merge_file_latency` | Avg file-merge latency | +| o/s | `series_write_rate` / `series_term_search_rate` | Inverted-index write / term-search rate | +| Count | `total_series` | Inverted-index documents | +| o/s | `stream_tst_write_rate` / `stream_tst_term_search_rate` | Stream tst index write / term-search rate | +| Count | `stream_tst_total_docs` | Stream tst index documents | +| ops | `queue_sub_throughput` | Subscribe-queue throughput by operation | +| s | `queue_sub_latency_p99` | Subscribe-queue latency p99 | +| percent | `retention_measure_disk_usage_percent` / `retention_stream_disk_usage_percent` / `retention_trace_disk_usage_percent` | Retention disk-usage % per scope | + +**Lifecycle** (the tier-migration sidecar on hot/warm data pods; `container_name == 'lifecycle'`): + +| Unit | Metric | Description | +| ---- | ------ | ----------- | +| Count | `lifecycle_cycles` | Cumulative migration cycles | +| s | `lifecycle_last_run` | Seconds since the last migration cycle started | +| Status | `lifecycle_last_run_success` | Last cycle status (1 = OK, 0 = failed) | + +### Endpoint scope — per group (`meter_banyandb_endpoint_*`) + +| Unit | Metric | Description | +| ---- | ------ | ----------- | +| w/s | `write_rate` | Write rate for the group | +| s | `query_latency` | Mean query latency for the group | +| Count | `total_data` | Total stored data elements for the group | +| o/s | 
`merge_file_rate` | Merge-loop rate for the group | +| s | `merge_file_latency` | Avg file-merge latency for the group | +| Count | `merge_file_partitions` | Avg parts merged per loop for the group | +| o/s | `series_write_rate` | Inverted-index write rate for the group | +| Count | `total_series` | Inverted-index documents for the group | +| ops | `queue_throughput` | Subscribe-queue throughput by operation for the group | +| s | `queue_latency_p99` | Publish-queue latency p99 for the group | +| Bytes/s | `publish_bytes` | Publish bytes for the group | ## Customizations -You can customize your own metrics/expression/dashboard panel.The metrics definition and expression rules are found in `/config/otel-rules/banyandb`.The [BanyanDB](https://skywalking.apache.org/docs/skywalking-banyandb/next/readme/) dashboard panel configurations ship from the SkyWalking Horizon UI bundle (apache/skywalking-horizon-ui); the OAP backend no longer hosts UI dashboard JSONs. + +You can customize your own metrics / expressions. The metric definitions and expression rules are in +`/config/otel-rules/banyandb`. The dashboard panel configurations ship from the SkyWalking Horizon UI +bundle (apache/skywalking-horizon-ui); the OAP backend does not host UI dashboard JSONs. diff --git a/docs/en/changes/changes.md b/docs/en/changes/changes.md index d548d25cd5..f6dd1e414a 100644 --- a/docs/en/changes/changes.md +++ b/docs/en/changes/changes.md @@ -242,6 +242,14 @@ admin-host only" entry above for the public REST retirement. #### OAP Server +* SWIP-15: rebuild BanyanDB self-observability around the cluster / container / group model + (requires BanyanDB 0.11+). `otel-rules/banyandb/` now models a BanyanDB cluster as one `Service` + (`service(['cluster'])`), each container as a `ServiceInstance` keyed on `pod_name` + `container_name` + (with `node_role` / `node_type` / `container_name` / `pod_name` as instance attributes), and each + storage group as an `Endpoint`. 
New `banyandb-endpoint.yaml`; `banyandb-service.yaml` and + `banyandb-instance.yaml` redesigned to mirror the upstream FODC-proxy Grafana boards. The stale + single-node `host_name` model and the removed `etcd_operation_rate` / `up`-derived `active_instance` + metrics are gone. * Runtime MAL/LAL hot-update rules can declare `layerDefinitions:` to introduce new layers. Ordinals are operator-pinned in the `100_000+` tier; the layer is refcount-tracked and unregistered when the last declaring rule is removed. See diff --git a/oap-server/server-starter/src/main/resources/otel-rules/banyandb/banyandb-endpoint.yaml b/oap-server/server-starter/src/main/resources/otel-rules/banyandb/banyandb-endpoint.yaml new file mode 100644 index 0000000000..ef61c460c8 --- /dev/null +++ b/oap-server/server-starter/src/main/resources/otel-rules/banyandb/banyandb-endpoint.yaml @@ -0,0 +1,96 @@ +# Licensed to the Apache Software Foundation (ASF) under one or more +# contributor license agreements. See the NOTICE file distributed with +# this work for additional information regarding copyright ownership. +# The ASF licenses this file to You under the Apache License, Version 2.0 +# (the "License"); you may not use this file except in compliance with +# the License. You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +# SWIP-15 section 3.3 Endpoint scope: a BanyanDB `group` (storage group, e.g. sw_metricsMinute, +# sw_trace) is modeled as an Endpoint under the cluster Service. 
The `cluster` label is the +# single static label the OTel collector injects per scrape job (it is NOT on the raw FODC +# wire); `group` is carried natively by every family referenced below. Every metric here is +# aggregated across all cluster nodes per group, so each rule's .sum() collapses the per-node / +# per-seg / per-shard / per-operation / per-remote dimensions down to ['cluster','group'] +# before any rate/histogram/division. MAL arithmetic ('+', '/') inner-joins on exact label +# equality, so every operand is reduced to the identical ['cluster','group'] (or +# ['cluster','group','le'] for histograms) label set first. +# Source expressions mirror the upstream BanyanDB Grafana "Workload" board +# (docs/operation/grafana-fodc-workload.json). +filter: "{ tags -> tags.job_name == 'banyandb-monitoring' }" +expSuffix: endpoint(['cluster'], ['group'], Layer.BANYANDB) +metricPrefix: meter_banyandb_endpoint +metricsRules: + # writes/s for the group, across the three data-model scopes (measure, stream, trace). The + # write counter carries `group` regardless of which role records it, so the by-group roll-up + # is exact. + - name: write_rate + exp: (banyandb_measure_total_written.sum(['cluster', 'group']).rate('PT1M') + banyandb_stream_tst_total_written.sum(['cluster', 'group']).rate('PT1M') + banyandb_trace_tst_total_written.sum(['cluster', 'group']).rate('PT1M')) + + # mean query latency (ms) for the group = sum(latency) / sum(count). liaison_grpc_total_latency + # and _started are BOTH counters (not a histogram), so this is a ratio of cumulative counters, + # not a percentile. Both filtered to method='query' and reduced to ['cluster','group'] + # (collapsing the `service` data-model facet) before the division joins on equal labels. 
+ - name: query_latency + exp: (banyandb_liaison_grpc_total_latency.tagEqual('method', 'query').sum(['cluster', 'group']) / banyandb_liaison_grpc_total_started.tagEqual('method', 'query').sum(['cluster', 'group'])) * 1000 + + # current total stored data elements for the group (gauge). Dimensioned by seg+shard+node_type + # across data nodes; .sum(['cluster','group']) collapses them into one per-group total. + - name: total_data + exp: (banyandb_measure_total_file_elements.sum(['cluster', 'group']) + banyandb_stream_tst_total_file_elements.sum(['cluster', 'group']) + banyandb_trace_tst_total_file_elements.sum(['cluster', 'group'])) + + # merge-loop iterations/min for the group (matches the upstream "Merge File Rate" rotrpm panel, + # which is rate(merge_loop_started) * 60). merge_loop_started carries node_type (NOT a `type` + # label), so no type filter applies here. + - name: merge_file_rate + exp: (banyandb_measure_total_merge_loop_started.sum(['cluster', 'group']).rate('PT1M') + banyandb_stream_tst_total_merge_loop_started.sum(['cluster', 'group']).rate('PT1M') + banyandb_trace_tst_total_merge_loop_started.sum(['cluster', 'group']).rate('PT1M')) * 60 + + # mean file-merge latency (ms) per merge loop for the group. merge_latency carries a `type` + # label (file/hot/mem); type='file' selects on-disk merges and is DATA-only on the wire + # (liaison emits only type='mem'). Divide accumulated merge-seconds by merge loops, both + # type/scope-aligned to ['cluster','group']. Matches the upstream "Merge File Latency" panel. 
+ - name: merge_file_latency + exp: ((banyandb_measure_total_merge_latency.tagEqual('type', 'file').sum(['cluster', 'group']).rate('PT1M') / banyandb_measure_total_merge_loop_started.sum(['cluster', 'group']).rate('PT1M')) + (banyandb_stream_tst_total_merge_latency.tagEqual('type', 'file').sum(['cluster', 'group']).rate('PT1M') / banyandb_stream_tst_total_merge_loop_started.sum(['cluster', 'group']).rate('PT1M')) + (banyandb_trace_tst_total_merge_latency.tagEqual('type', 'file').sum(['cluster', 'group']).rate('PT1 [...] + + # avg parts merged per merge loop on the on-disk merge path for the group (matches the upstream + # "Merge File Partitions" panel = rate(merged_parts{type=file}) / rate(merge_loop_started)). + # merged_parts carries `type`; type='file' is DATA-only (liaison emits only type='mem'). + - name: merge_file_partitions + exp: ((banyandb_measure_total_merged_parts.tagEqual('type', 'file').sum(['cluster', 'group']).rate('PT1M') / banyandb_measure_total_merge_loop_started.sum(['cluster', 'group']).rate('PT1M')) + (banyandb_stream_tst_total_merged_parts.tagEqual('type', 'file').sum(['cluster', 'group']).rate('PT1M') / banyandb_stream_tst_total_merge_loop_started.sum(['cluster', 'group']).rate('PT1M')) + (banyandb_trace_tst_total_merged_parts.tagEqual('type', 'file').sum(['cluster', 'group']).rate('PT1M') [...] + + # inverted-index updates/s for the group. NOTE: *_inverted_index_total_updates is # TYPE=gauge + # though cumulative; rate() over a cumulative gauge yields a per-window delta (updates/s). Stream + # uses two index scopes -- both stream_storage_* and stream_tst_* are summed in. Data-only family. 
+ - name: series_write_rate + exp: (banyandb_measure_inverted_index_total_updates.sum(['cluster', 'group']).rate('PT1M') + banyandb_stream_storage_inverted_index_total_updates.sum(['cluster', 'group']).rate('PT1M') + banyandb_stream_tst_inverted_index_total_updates.sum(['cluster', 'group']).rate('PT1M')) + + # total inverted-index documents (series proxy) for the group (gauge, direct read, no rate). + # Both stream index scopes summed. Dimensioned by seg across data nodes; sum collapses to group. + - name: total_series + exp: (banyandb_measure_inverted_index_total_doc_count.sum(['cluster', 'group']) + banyandb_stream_storage_inverted_index_total_doc_count.sum(['cluster', 'group']) + banyandb_stream_tst_inverted_index_total_doc_count.sum(['cluster', 'group'])) + + # subscribe-side queue throughput (msgs/s) for the group, broken out by operation. queue_sub is + # emitted on BOTH data (operations batch-write/control/file-sync/query) and liaison + # (batch-write only); `operation` is kept in the group-by so the dashboard can split per op. + - name: queue_throughput + exp: banyandb_queue_sub_total_finished.sum(['cluster', 'group', 'operation']).rate('PT1M') + + # publish-side queue p99 latency for the group. queue_pub_total_latency IS a histogram on the + # wire (_bucket carries le); keep le + group + operation in the .sum() group-by, then + # .histogram().histogram_percentile([99]). queue_pub is liaison-only. Precedent: + # oap.yaml / nginx-endpoint.yaml histogram idiom. + - name: queue_latency_p99 + exp: banyandb_queue_pub_total_latency.sum(['le', 'cluster', 'group', 'operation']).histogram().histogram_percentile([99]) + + # publish bytes/s for the group. Wire family is banyandb_queue_pub_sent_bytes -- NO `total` + # infix (unlike queue_pub_total_started/_finished). Liaison-only; sum collapses + # operation/remote_node/remote_role/remote_tier before the rate. 
+ - name: publish_bytes + exp: banyandb_queue_pub_sent_bytes.sum(['cluster', 'group']).rate('PT1M') diff --git a/oap-server/server-starter/src/main/resources/otel-rules/banyandb/banyandb-instance.yaml b/oap-server/server-starter/src/main/resources/otel-rules/banyandb/banyandb-instance.yaml index 21955331f3..c3f728b139 100644 --- a/oap-server/server-starter/src/main/resources/otel-rules/banyandb/banyandb-instance.yaml +++ b/oap-server/server-starter/src/main/resources/otel-rules/banyandb/banyandb-instance.yaml @@ -13,74 +13,157 @@ # See the License for the specific language governing permissions and # limitations under the License. -# This will parse a textual representation of a duration. The formats -# accepted are based on the ISO-8601 duration format {@code PnDTnHnMn.nS} -# with days considered to be exactly 24 hours. -# <p> -# Examples: -# <pre> -# "PT20.345S" -- parses as "20.345 seconds" -# "PT15M" -- parses as "15 minutes" (where a minute is 60 seconds) -# "PT10H" -- parses as "10 hours" (where an hour is 3600 seconds) -# "P2D" -- parses as "2 days" (where a day is 24 hours or 86400 seconds) -# "P2DT3H4M" -- parses as "2 days, 3 hours and 4 minutes" -# "P-6H3M" -- parses as "-6 hours and +3 minutes" -# "-P6H3M" -- parses as "-6 hours and -3 minutes" -# "-P-6H+3M" -- parses as "+6 hours and -3 minutes" -# </pre> +# SWIP-15: BanyanDB self-observability, ServiceInstance scope = one container on a node. +# The instance identity is pod_name + container_name (a data hot/warm pod co-hosts a data and a +# lifecycle container under one pod_name), joined by '@'. role (container_name) and tier (node_type) +# ride as instance attributes via the 6-arg instance() properties closure; node_type Elvis-defaults +# to 'n/a' off data containers (it is absent on liaison samples, present on every ROLE_DATA sample). 
+# +# Every rule that aggregates keeps ['cluster','pod_name','container_name','node_role','node_type'] in +# its .sum()/.avg()/.max() group-by: SampleFamily.aggregate() drops labels not in the group-by, and +# the properties closure reads them from the post-aggregation sample (SampleFamily.java:810). node_type +# rides on every ROLE_DATA sample (system_*, go_*, process_* included), so a data instance resolves a +# stable tier across all rules; liaison families carry none, so liaison resolves 'n/a' consistently. +# +# Source expressions mirror the upstream BanyanDB Grafana "Nodes" board +# (docs/operation/grafana-fodc-nodes.json) plus the liaison/data rows of the "Workload" board, so the +# SkyWalking instance dashboard stays in lockstep with the upstream catalog. filter: "{ tags -> tags.job_name == 'banyandb-monitoring' }" -expSuffix: tag({tags -> tags.host_name = 'banyandb::' + tags.host_name}).service(['host_name'] , Layer.BANYANDB).instance(['host_name'], ['service_instance_id'], Layer.BANYANDB) -metricPrefix: meter_banyandb +expSuffix: |- + service(['cluster'], Layer.BANYANDB) + .instance(['cluster'], '::', ['pod_name', 'container_name'], '@', Layer.BANYANDB, { tags -> ['node_role': tags.node_role, 'node_type': tags.node_type ?: 'n/a', 'pod_name': tags.pod_name, 'container_name': tags.container_name] }) +metricPrefix: meter_banyandb_instance metricsRules: - - name: instance_write_rate - exp: banyandb_measure_total_written.rate('PT15S')+banyandb_stream_tst_total_written.rate('PT15S') - - name: instance_total_memory - exp: banyandb_system_memory_state.tagEqual('kind','total') - - name: instance_disk_usage - exp: banyandb_system_disk.tagEqual('kind','used').sum(['host_name','service_instance_id']) - - name: instance_query_rate - exp: banyandb_liaison_grpc_total_started.sum(['method','host_name','service_instance_id']) - - name: instance_total_cpu - exp: banyandb_system_cpu_num - - name: instance_write_and_query_errors_rate - exp: 
banyandb_liaison_grpc_total_err.tagEqual('method','query').sum(['method','host_name','service_instance_id']).rate('PT15S')*60 + banyandb_liaison_grpc_total_stream_msg_sent_err.sum(['host_name','service_instance_id']).rate('PT15S')*60 + banyandb_liaison_grpc_total_stream_msg_received_err.sum(['host_name','service_instance_id']).rate('PT15S')*60 + banyandb_queue_sub_total_msg_sent_err.sum(['host_name','service_instance_id']).rate('PT15S')*60 - - name: instance_etcd_operation_rate - exp: banyandb_liaison_grpc_total_registry_started.sum(['host_name','service_instance_id']).rate('PT15S') + banyandb_liaison_grpc_total_started.sum(['host_name','service_instance_id']).rate('PT15S') - - name: instance_active_instance - exp: up.sum(['host_name','service_instance_id']).downsampling(MIN) - - name: instance_cpu_usage - exp: (((process_cpu_seconds_total.sum(['host_name','service_instance_id']).rate('PT15S') / banyandb_system_cpu_num.sum(['host_name','service_instance_id']))).max(['host_name','service_instance_id']))*1000 - - name: instance_rss_memory_usage - exp: ((process_resident_memory_bytes.sum(['host_name','service_instance_id']).downsampling(MAX) / banyandb_system_memory_state.tagEqual('kind','total').sum(['host_name','service_instance_id'])).max(['host_name','service_instance_id']))*1000 - - name: instance_disk_usage_all - exp: ((banyandb_system_disk.tagEqual('kind','used').sum(['host_name','service_instance_id']) / banyandb_system_memory_state.tagEqual('kind','total').sum(['host_name','service_instance_id'])).max(['host_name','service_instance_id']))*1000 - - name: instance_network_usage_recv - exp: banyandb_system_net_state.tagEqual('kind','bytes_recv').sum(['host_name','service_instance_id']).rate('PT15S') - - name: instance_network_usage_sent - exp: banyandb_system_net_state.tagEqual('kind','bytes_sent').sum(['host_name','service_instance_id']).rate('PT15S') - - name: instance_storage_write_rate - exp: 
banyandb_measure_total_written.sum(['group','host_name','service_instance_id']).rate('PT15S')*1000 - - name: instance_query_latency - exp: (banyandb_liaison_grpc_total_latency.tagEqual('method','query').sum(['group','host_name','service_instance_id']).rate('PT15S') / banyandb_liaison_grpc_total_started.tagEqual('method','query').sum(['group','host_name','service_instance_id']).rate('PT15S'))*1000 - - name: instance_total_data - exp: banyandb_measure_total_file_elements.sum(['group','host_name','service_instance_id']) - - name: instance_merge_file_data - exp: banyandb_measure_total_merge_loop_started.sum(['group','host_name','service_instance_id']).rate('PT15S') * 60 *1000 - - name: instance_merge_file_latency - exp: (banyandb_measure_total_merge_latency.tagEqual('type','file').sum(['group','host_name','service_instance_id']).rate('PT15S') / banyandb_measure_total_merge_loop_started.sum(['group','host_name','service_instance_id']).rate('PT15S'))*1000 - - name: instance_merge_file_partitions - exp: (banyandb_measure_total_merged_parts.tagEqual('type','file').sum(['group','host_name','service_instance_id']).rate('PT15S') / banyandb_measure_total_merge_loop_started.sum(['group','host_name','service_instance_id']).rate('PT15S'))*1000 - - name: instance_series_write_rate - exp: (banyandb_measure_inverted_index_total_updates.sum(['group','host_name','service_instance_id']).rate('PT15S'))*1000 - - name: instance_series_term_search_rate - exp: banyandb_stream_storage_inverted_index_total_term_searchers_started.sum(['group','host_name','service_instance_id']).rate('PT15S') - - name: instance_total_series - exp: banyandb_measure_inverted_index_total_doc_count.sum(['group','host_name','service_instance_id']) - - name: instance_stream_write_rate - exp: banyandb_stream_tst_inverted_index_total_updates.sum(['group','host_name','service_instance_id']).rate('PT15S') - - name: instance_term_search_rate - exp: 
banyandb_stream_tst_inverted_index_total_term_searchers_started.sum(['group','host_name','service_instance_id']).rate('PT15S')* 1000 - - name: instance_total_document - exp: banyandb_stream_tst_inverted_index_total_doc_count.sum(['group','host_name','service_instance_id']) + # ---- All roles: Resources / Disk by Path / Go Runtime (every container emits these) ---- + # node uptime (s). Raw gauge; ABSENT on lifecycle containers (their binary runs the metric service + # without the system collector), so the lifecycle instance shows no uptime. + - name: node_uptime + exp: banyandb_system_up_time + # CPU usage (cores). process_* rides on every container including lifecycle. + - name: cpu_usage + exp: process_cpu_seconds_total.sum(['cluster','pod_name','container_name','node_role','node_type']).rate('PT15S') + # resident memory (bytes). Raw gauge, present on all containers. + - name: rss_memory + exp: process_resident_memory_bytes + # system memory used %. kind='used_percent' is emitted directly (a 0-1 fraction; source divides by 100). + - name: system_memory_percent + exp: banyandb_system_memory_state.tagEqual('kind','used_percent') + # disk used % = Σused / Σtotal across the node's data paths (matches the Grafana "Disk Usage %" panel). + - name: disk_usage_percent + exp: banyandb_system_disk.tagEqual('kind','used').sum(['cluster','pod_name','container_name','node_role','node_type']) / banyandb_system_disk.tagEqual('kind','total').sum(['cluster','pod_name','container_name','node_role','node_type']) + # disk used / total / used% broken out per mount path. 
+ - name: disk_used_by_path + exp: banyandb_system_disk.tagEqual('kind','used').sum(['cluster','pod_name','container_name','node_role','node_type','path']) + - name: disk_total_by_path + exp: banyandb_system_disk.tagEqual('kind','total').sum(['cluster','pod_name','container_name','node_role','node_type','path']) + - name: disk_used_percent_by_path + exp: banyandb_system_disk.tagEqual('kind','used').sum(['cluster','pod_name','container_name','node_role','node_type','path']) / banyandb_system_disk.tagEqual('kind','total').sum(['cluster','pod_name','container_name','node_role','node_type','path']) + # network throughput (bytes/s) by interface name. + - name: network_recv + exp: banyandb_system_net_state.tagEqual('kind','bytes_recv').sum(['cluster','pod_name','container_name','node_role','node_type','name']).rate('PT15S') + - name: network_sent + exp: banyandb_system_net_state.tagEqual('kind','bytes_sent').sum(['cluster','pod_name','container_name','node_role','node_type','name']).rate('PT15S') + # Go runtime. + - name: goroutines + exp: go_goroutines + # average GC pause (s) = rate(Σpause) / rate(Σcount). go_gc_duration_seconds is a summary (no buckets), + # so this ratio of _sum/_count is the only valid average — do not apply histogram_percentile to it. + - name: gc_pause_avg + exp: go_gc_duration_seconds_sum.sum(['cluster','pod_name','container_name','node_role','node_type']).rate('PT15S') / go_gc_duration_seconds_count.sum(['cluster','pod_name','container_name','node_role','node_type']).rate('PT15S') + - name: heap_inuse + exp: go_memstats_heap_inuse_bytes + - name: heap_next_gc + exp: go_memstats_next_gc_bytes + - name: alloc_rate + exp: go_memstats_alloc_bytes_total.sum(['cluster','pod_name','container_name','node_role','node_type']).rate('PT15S') + + # ---- Liaison only (front door; the dashboard gates these on container_name == liaison) ---- + # query rate (req/s) by data-model service (measure/stream/trace/property). method literal is "query". 
+ - name: query_rate_by_service + exp: banyandb_liaison_grpc_total_started.tagEqual('method','query').sum(['cluster','pod_name','container_name','node_role','node_type','service']).rate('PT15S') + # gRPC errors/min. Three liaison-side error families (mirrors the Grafana "gRPC Error Rate" panel, + # which sums total_err + registry_err + stream_msg_received_err). All lazily registered -> empty on a + # healthy cluster; each pre-aggregated to the same label set before '+'. + - name: grpc_error_rate + exp: (banyandb_liaison_grpc_total_err.sum(['cluster','pod_name','container_name','node_role','node_type']).rate('PT15S') + banyandb_liaison_grpc_total_registry_err.sum(['cluster','pod_name','container_name','node_role','node_type']).rate('PT15S') + banyandb_liaison_grpc_total_stream_msg_received_err.sum(['cluster','pod_name','container_name','node_role','node_type']).rate('PT15S')) * 60 + # non-query operation rate (req/s): registry ops + any non-query unary call. total_started is + # query-only on the wire, so tagNotEqual('method','query') is empty today; registry_started carries it. + - name: non_query_op_rate + exp: banyandb_liaison_grpc_total_registry_started.sum(['cluster','pod_name','container_name','node_role','node_type']).rate('PT15S') + banyandb_liaison_grpc_total_started.tagNotEqual('method','query').sum(['cluster','pod_name','container_name','node_role','node_type']).rate('PT15S') + # write rate (writes/s) seen at the liaison front door. group label dropped (instance-level total). + - name: write_rate + exp: banyandb_measure_total_written.sum(['cluster','pod_name','container_name','node_role','node_type']).rate('PT15S') + banyandb_stream_tst_total_written.sum(['cluster','pod_name','container_name','node_role','node_type']).rate('PT15S') + banyandb_trace_tst_total_written.sum(['cluster','pod_name','container_name','node_role','node_type']).rate('PT15S') + # tier-2 publish pipeline (liaison -> data): throughput by operation, bytes/s, and p99 send latency. 
+ - name: publish_throughput + exp: banyandb_queue_pub_total_finished.sum(['cluster','pod_name','container_name','node_role','node_type','operation']).rate('PT15S') + - name: publish_bytes + exp: banyandb_queue_pub_sent_bytes.sum(['cluster','pod_name','container_name','node_role','node_type']).rate('PT15S') + - name: publish_latency_p99 + exp: banyandb_queue_pub_total_latency.sum(['cluster','pod_name','container_name','node_role','node_type','operation','le']).histogram().histogram_percentile([99]) + # write-queue (wqueue) depth: pending records, on-disk file parts, in-memory parts. On the liaison + # these reflect the write buffer; the same families on data containers reflect storage parts (the + # dashboard gates on container_name). Gauges, summed to the instance. + - name: wqueue_pending + exp: banyandb_measure_pending_data_count.sum(['cluster','pod_name','container_name','node_role','node_type']) + banyandb_stream_tst_pending_data_count.sum(['cluster','pod_name','container_name','node_role','node_type']) + banyandb_trace_tst_pending_data_count.sum(['cluster','pod_name','container_name','node_role','node_type']) + - name: wqueue_file_parts + exp: banyandb_measure_total_file_parts.sum(['cluster','pod_name','container_name','node_role','node_type']) + banyandb_stream_tst_total_file_parts.sum(['cluster','pod_name','container_name','node_role','node_type']) + banyandb_trace_tst_total_file_parts.sum(['cluster','pod_name','container_name','node_role','node_type']) + - name: wqueue_mem_part + exp: banyandb_measure_total_mem_part.sum(['cluster','pod_name','container_name','node_role','node_type']) + banyandb_stream_tst_total_mem_part.sum(['cluster','pod_name','container_name','node_role','node_type']) + banyandb_trace_tst_total_mem_part.sum(['cluster','pod_name','container_name','node_role','node_type']) + # ---- Data only (backend; the dashboard gates these on container_name == data) ---- + # total stored data elements (gauge). 
+ - name: total_data + exp: banyandb_measure_total_file_elements.sum(['cluster','pod_name','container_name','node_role','node_type']) + banyandb_stream_tst_total_file_elements.sum(['cluster','pod_name','container_name','node_role','node_type']) + banyandb_trace_tst_total_file_elements.sum(['cluster','pod_name','container_name','node_role','node_type']) + # merge-loop iterations/s. + - name: merge_file_rate + exp: banyandb_measure_total_merge_loop_started.sum(['cluster','pod_name','container_name','node_role','node_type']).rate('PT15S') + banyandb_stream_tst_total_merge_loop_started.sum(['cluster','pod_name','container_name','node_role','node_type']).rate('PT15S') + banyandb_trace_tst_total_merge_loop_started.sum(['cluster','pod_name','container_name','node_role','node_type']).rate('PT15S') + # avg parts merged per merge loop on the file path (matches Grafana = rate(merged_parts{type=file}) / + # rate(merge_loop_started)). type='file' is data-only on the wire (liaison emits only type='mem'). + - name: merge_file_partitions + exp: (banyandb_measure_total_merged_parts.tagEqual('type','file').sum(['cluster','pod_name','container_name','node_role','node_type']).rate('PT15S') / banyandb_measure_total_merge_loop_started.sum(['cluster','pod_name','container_name','node_role','node_type']).rate('PT15S')) + (banyandb_stream_tst_total_merged_parts.tagEqual('type','file').sum(['cluster','pod_name','container_name','node_role','node_type']).rate('PT15S') / banyandb_stream_tst_total_merge_loop_started.sum(['cluster', [...] + # avg file-merge latency (ms) per merge loop. 
+ - name: merge_file_latency + exp: ((banyandb_measure_total_merge_latency.tagEqual('type','file').sum(['cluster','pod_name','container_name','node_role','node_type']).rate('PT15S') / banyandb_measure_total_merge_loop_started.sum(['cluster','pod_name','container_name','node_role','node_type']).rate('PT15S')) + (banyandb_stream_tst_total_merge_latency.tagEqual('type','file').sum(['cluster','pod_name','container_name','node_role','node_type']).rate('PT15S') / banyandb_stream_tst_total_merge_loop_started.sum(['cluste [...] + # inverted-index (series) write rate / term-search rate / total docs. *_inverted_index_total_* are + # # TYPE=gauge but cumulative, so rate() yields a per-window delta. Stream's series index is the + # storage scope (stream_storage_*); the tst scope is reported separately below. + - name: series_write_rate + exp: banyandb_measure_inverted_index_total_updates.sum(['cluster','pod_name','container_name','node_role','node_type']).rate('PT15S') + banyandb_stream_storage_inverted_index_total_updates.sum(['cluster','pod_name','container_name','node_role','node_type']).rate('PT15S') + - name: series_term_search_rate + exp: banyandb_measure_inverted_index_total_term_searchers_started.sum(['cluster','pod_name','container_name','node_role','node_type']).rate('PT15S') + banyandb_stream_storage_inverted_index_total_term_searchers_started.sum(['cluster','pod_name','container_name','node_role','node_type']).rate('PT15S') + - name: total_series + exp: banyandb_measure_inverted_index_total_doc_count.sum(['cluster','pod_name','container_name','node_role','node_type']) + banyandb_stream_storage_inverted_index_total_doc_count.sum(['cluster','pod_name','container_name','node_role','node_type']) + # stream time-series-table (tst) index, distinct from the stream series (storage) index above. 
+ - name: stream_tst_write_rate + exp: banyandb_stream_tst_inverted_index_total_updates.sum(['cluster','pod_name','container_name','node_role','node_type']).rate('PT15S') + - name: stream_tst_term_search_rate + exp: banyandb_stream_tst_inverted_index_total_term_searchers_started.sum(['cluster','pod_name','container_name','node_role','node_type']).rate('PT15S') + - name: stream_tst_total_docs + exp: banyandb_stream_tst_inverted_index_total_doc_count.sum(['cluster','pod_name','container_name','node_role','node_type']) + # subscribe-side queue (data receives from liaison): throughput by operation + p99 latency. + - name: queue_sub_throughput + exp: banyandb_queue_sub_total_finished.sum(['cluster','pod_name','container_name','node_role','node_type','operation']).rate('PT15S') + - name: queue_sub_latency_p99 + exp: banyandb_queue_sub_total_latency.sum(['cluster','pod_name','container_name','node_role','node_type','operation','le']).histogram().histogram_percentile([99]) + # retention disk-usage % per data-model scope (0-100 gauge). Kept per scope rather than summed (a sum + # of three percentages is meaningless). Not in the upstream Grafana boards; a SkyWalking addition. + - name: retention_measure_disk_usage_percent + exp: banyandb_storage_retention_measure_disk_usage_percent + - name: retention_stream_disk_usage_percent + exp: banyandb_storage_retention_stream_disk_usage_percent + - name: retention_trace_disk_usage_percent + exp: banyandb_storage_retention_trace_disk_usage_percent + # ---- Lifecycle only (the tier-migration sidecar on hot/warm data pods; container_name == lifecycle) ---- + # cumulative migration cycles. Compiled into build #1166 (BanyanDB #1164) but lazily registered: + # emits no series until the first migration cycle fires. Dashboard renders absent-as-0. + - name: lifecycle_cycles + exp: banyandb_lifecycle_cycles_total + # seconds since the last migration cycle started = now - epoch. 
time() is the MAL ingest-time scalar + # (MQE has no current-time function), computed when the rule runs. BUILD-GATED: the source gauge is + # BanyanDB #1167+, absent on build #1166 -> no series until the cluster runs >= #1167 AND a cycle ends. + - name: lifecycle_last_run + exp: (time() - banyandb_lifecycle_last_run_timestamp_seconds.max(['cluster','pod_name','container_name','node_role','node_type'])) + # last cycle status (1 = OK, 0 = failed). Same #1167 build gate as lifecycle_last_run. + - name: lifecycle_last_run_success + exp: banyandb_lifecycle_last_run_success diff --git a/oap-server/server-starter/src/main/resources/otel-rules/banyandb/banyandb-service.yaml b/oap-server/server-starter/src/main/resources/otel-rules/banyandb/banyandb-service.yaml index 566f893cc4..97c6cac8f6 100644 --- a/oap-server/server-starter/src/main/resources/otel-rules/banyandb/banyandb-service.yaml +++ b/oap-server/server-starter/src/main/resources/otel-rules/banyandb/banyandb-service.yaml @@ -13,74 +13,51 @@ # See the License for the specific language governing permissions and # limitations under the License. -# This will parse a textual representation of a duration. The formats -# accepted are based on the ISO-8601 duration format {@code PnDTnHnMn.nS} -# with days considered to be exactly 24 hours. -# <p> -# Examples: -# <pre> -# "PT20.345S" -- parses as "20.345 seconds" -# "PT15M" -- parses as "15 minutes" (where a minute is 60 seconds) -# "PT10H" -- parses as "10 hours" (where an hour is 3600 seconds) -# "P2D" -- parses as "2 days" (where a day is 24 hours or 86400 seconds) -# "P2DT3H4M" -- parses as "2 days, 3 hours and 4 minutes" -# "P-6H3M" -- parses as "-6 hours and +3 minutes" -# "-P6H3M" -- parses as "-6 hours and -3 minutes" -# "-P-6H+3M" -- parses as "+6 hours and -3 minutes" -# </pre> +# SWIP-15: BanyanDB self-observability, Service scope = one BanyanDB cluster. 
+# The FODC proxy is the single scrape target; the collector injects one static label +# `cluster` (the only label SkyWalking must add). Every BanyanDB-native family carries +# the `banyandb_` prefix; only the Go-runtime / process exporter families (go_* / process_*) +# are bare. All cluster KPIs collapse to the single `cluster` series via `.sum(['cluster'])`, +# which is also what makes the heterogeneous error families joinable by MAL `+` +# (MAL arithmetic inner-joins on exact label equality). +# Source expressions mirror the upstream BanyanDB Grafana boards +# (docs/operation/grafana-fodc-workload.json) so the SkyWalking dashboards stay in lockstep. filter: "{ tags -> tags.job_name == 'banyandb-monitoring' }" -expSuffix: tag({tags -> tags.host_name = 'banyandb::' + tags.host_name}).service(['host_name'] , Layer.BANYANDB) +expSuffix: service(['cluster'], Layer.BANYANDB) metricPrefix: meter_banyandb metricsRules: - - name: write_rate - exp: (banyandb_measure_total_written.sum(['host_name','service_instance_id']).rate('PT15S') + banyandb_stream_tst_total_written.sum(['host_name','service_instance_id']).rate('PT15S')) - - name: total_memory - exp: banyandb_system_memory_state.tagEqual('kind','total').sum(['host_name']) - - name: disk_usage - exp: banyandb_system_disk.tagEqual('kind','used').sum(['host_name','service_instance_id']) - - name: query_rate - exp: banyandb_liaison_grpc_total_started.sum(['method','host_name','service_instance_id']) - - name: total_cpu - exp: banyandb_system_cpu_num.sum(['method','host_name','service_instance_id']) - - name: write_and_query_errors_rate - exp: banyandb_liaison_grpc_total_err.tagEqual('method','query').sum(['method','host_name','service_instance_id']).rate('PT15S')*60 + banyandb_liaison_grpc_total_stream_msg_sent_err.sum(['host_name','service_instance_id']).rate('PT15S')*60 + banyandb_liaison_grpc_total_stream_msg_received_err.sum(['host_name','service_instance_id']).rate('PT15S')*60 + 
banyandb_queue_sub_total_msg_sent_err.sum(['host_name','service_instance_id']).rate('PT15S')*60 - - name: etcd_operation_rate - exp: banyandb_liaison_grpc_total_registry_started.sum(['host_name','service_instance_id']).rate('PT15S') + banyandb_liaison_grpc_total_started.sum(['host_name','service_instance_id']).rate('PT15S') - - name: active_instance - exp: up.sum(['host_name','service_instance_id']).downsampling(MIN) - - name: cpu_usage - exp: (((process_cpu_seconds_total.sum(['host_name','service_instance_id']).rate('PT15S') / banyandb_system_cpu_num.sum(['host_name','service_instance_id']))).max(['host_name','service_instance_id']))*1000 - - name: rss_memory_usage - exp: ((process_resident_memory_bytes.sum(['host_name','service_instance_id']).downsampling(MAX) / banyandb_system_memory_state.tagEqual('kind','total').sum(['host_name','service_instance_id'])).max(['host_name','service_instance_id']))*1000 - - name: disk_usage_all - exp: ((banyandb_system_disk.tagEqual('kind','used').sum(['host_name','service_instance_id']) / banyandb_system_memory_state.tagEqual('kind','total').sum(['host_name','service_instance_id'])).max(['host_name','service_instance_id']))*1000 - - name: network_usage_recv - exp: banyandb_system_net_state.tagEqual('kind','bytes_recv').sum(['host_name','service_instance_id']).rate('PT15S') - - name: network_usage_sent - exp: banyandb_system_net_state.tagEqual('kind','bytes_sent').sum(['host_name','service_instance_id']).rate('PT15S') - - name: storage_write_rate - exp: banyandb_measure_total_written.sum(['group','host_name','service_instance_id']).rate('PT15S')*1000 - - name: query_latency - exp: (banyandb_liaison_grpc_total_latency.tagEqual('method','query').sum(['group','host_name','service_instance_id']).rate('PT15S') / banyandb_liaison_grpc_total_started.tagEqual('method','query').sum(['group','host_name','service_instance_id']).rate('PT15S'))*1000 - - name: total_data - exp: 
banyandb_measure_total_file_elements.sum(['group','host_name','service_instance_id']) - - name: merge_file_data - exp: banyandb_measure_total_merge_loop_started.sum(['group','host_name','service_instance_id']).rate('PT15S') * 60 *1000 - - name: merge_file_latency - exp: (banyandb_measure_total_merge_latency.tagEqual('type','file').sum(['group','host_name','service_instance_id']).rate('PT15S') / banyandb_measure_total_merge_loop_started.sum(['group','host_name','service_instance_id']).rate('PT15S'))*1000 - - name: merge_file_partitions - exp: (banyandb_measure_total_merged_parts.tagEqual('type','file').sum(['group','host_name','service_instance_id']).rate('PT15S') / banyandb_measure_total_merge_loop_started.sum(['group','host_name','service_instance_id']).rate('PT15S'))*1000 - - name: series_write_rate - exp: (banyandb_measure_inverted_index_total_updates.sum(['group','host_name','service_instance_id']).rate('PT15S'))*1000 - - name: series_term_search_rate - exp: banyandb_stream_storage_inverted_index_total_term_searchers_started.sum(['group','host_name','service_instance_id']).rate('PT15S') - - name: total_series - exp: banyandb_measure_inverted_index_total_doc_count.sum(['group','host_name','service_instance_id']) - - name: stream_write_rate - exp: banyandb_stream_tst_inverted_index_total_updates.sum(['group','host_name','service_instance_id']).rate('PT15S') - - name: term_search_rate - exp: banyandb_stream_tst_inverted_index_total_term_searchers_started.sum(['group','host_name','service_instance_id']).rate('PT15S')* 1000 - - name: total_document - exp: banyandb_stream_tst_inverted_index_total_doc_count.sum(['group','host_name','service_instance_id']) - - + # cluster writes/s across the three data-model scopes (measure, stream, trace). Each scope's + # write counter is collapsed to one per-cluster series before `+`. 
+ - name: cluster_write_rate + exp: (banyandb_measure_total_written.sum(['cluster']).rate('PT15S') + banyandb_stream_tst_total_written.sum(['cluster']).rate('PT15S') + banyandb_trace_tst_total_written.sum(['cluster']).rate('PT15S')) + # cluster queries/s. `service` on this family is BanyanDB's data-model facet + # (measure/stream/trace/property), not a SkyWalking service; method literal is "query". + - name: cluster_query_rate + exp: banyandb_liaison_grpc_total_started.tagEqual('method','query').sum(['cluster']).rate('PT15S') + # cluster errors/min. The seven liaison-side error families mirror the upstream Grafana + # "Error Rate" stat (grafana-fodc-workload.json). Each is pre-aggregated to ['cluster'] + # BEFORE `+` because their wire label sets differ (stream_msg_received_err carries + # group/method/service, registry_err carries method/service, sync_loop_err carries group) + # and MAL `+` joins on exact label equality. On a healthy cluster most of these are lazily + # registered and emit no series; MAL treats an empty operand as the additive identity, so the + # sum emits from whatever has fired and renders absent-as-0 when nothing has. + - name: cluster_error_rate + exp: (banyandb_liaison_grpc_total_err.sum(['cluster']).rate('PT15S') + banyandb_liaison_grpc_total_registry_err.sum(['cluster']).rate('PT15S') + banyandb_liaison_grpc_total_stream_msg_received_err.sum(['cluster']).rate('PT15S') + banyandb_queue_pub_total_err.sum(['cluster']).rate('PT15S') + banyandb_measure_total_sync_loop_err.sum(['cluster']).rate('PT15S') + banyandb_stream_tst_total_sync_loop_err.sum(['cluster']).rate('PT15S') + banyandb_trace_tst_total_sync_loop_err.sum(['cluster' [...] + # live container count by role. count(['cluster','container_name','pod_name']) groups by all + # three then re-groups excluding the last key (pod_name), yielding one sample per + # (cluster, container_name) whose value = distinct pod_name count -> data=N, liaison=M. 
+  # Mirrors the upstream "Nodes by Role" stat (count(banyandb_system_up_time) by container_name).
+  # CAVEAT: banyandb_system_up_time has NO lifecycle series (the lifecycle sidecar runs its
+  # metric service without the system collector), so this never emits a lifecycle row.
+  - name: reporting_instances
+    exp: banyandb_system_up_time.count(['cluster','container_name','pod_name'])
+  # cluster CPU capacity = sum of per-container visible core counts (no lifecycle series).
+  - name: total_cpu_cores
+    exp: banyandb_system_cpu_num.sum(['cluster'])
+  # cluster memory used (bytes). kind='used' is a real wire value (kind in total/used/used_percent).
+  - name: total_memory_used
+    exp: banyandb_system_memory_state.tagEqual('kind','used').sum(['cluster'])
+  # cluster disk used (bytes). system_disk carries a `path` label with multiple data roots;
+  # .sum(['cluster']) collapses all paths into one cluster total.
+  - name: total_disk_used
+    exp: banyandb_system_disk.tagEqual('kind','used').sum(['cluster'])
diff --git a/test/e2e-v2/cases/banyandb/banyandb-cases.yaml b/test/e2e-v2/cases/banyandb/banyandb-cases.yaml
index dfc901f490..83367a20c9 100644
--- a/test/e2e-v2/cases/banyandb/banyandb-cases.yaml
+++ b/test/e2e-v2/cases/banyandb/banyandb-cases.yaml
@@ -13,30 +13,59 @@
 # See the License for the specific language governing permissions and
 # limitations under the License.

-# This file contains BanyanDB instance metrics queries, referencing
-# oap-server/server-starter/src/main/resources/otel-rules/banyandb.yaml
-
+# SWIP-15 BanyanDB self-observability metrics, the cluster / container / group model. References
+# oap-server/server-starter/src/main/resources/otel-rules/banyandb/{banyandb-service,banyandb-instance,banyandb-endpoint}.yaml
+# Entity identities come from the collector's injected labels (otel-collector-config.yaml):
+#   Service = cluster -> e2e-banyandb
+#   Instance = pod_name '@' container_name -> banyandb-liaison-0@liaison, banyandb-data-hot-0@data
+#   Endpoint = group -> sw_metricsMinute (an OAP self-telemetry group)
+# Expected templates:
+#   metrics-has-value.yml — single unlabeled series (service/endpoint metrics summed to the entity key)
+#   metrics-has-label-value.yml — labeled series. Instance metrics retain node_role/node_type labels
+#     (kept in the .sum() group-by so the instance properties closure resolves
+#     role/tier); reporting_instances is labeled by container_name; queue
+#     metrics are labeled by operation.
+# This minimal cluster (1 liaison + 1 hot data, no FODC) intentionally does NOT cover: lifecycle
+# (no migration sidecar / cycle), warm & cold tiers, error counters (lazily registered -> empty on a
+# healthy cluster), or multi-node aggregation.
 cases:
-  - query: swctl --display yaml --base-url=http://${oap_host}:${oap_12800}/graphql metrics exec --expression=meter_banyandb_total_memory --service-name=banyandb::server
-    expected: expected/metrics-has-value.yml
-  - query: swctl --display yaml --base-url=http://${oap_host}:${oap_12800}/graphql metrics exec --expression=meter_banyandb_instance_write_rate --service-name=banyandb::server --instance-name=banyandb:2121
-    expected: expected/metrics-has-value.yml
-  - query: swctl --display yaml --base-url=http://${oap_host}:${oap_12800}/graphql metrics exec --expression=meter_banyandb_instance_total_memory --service-name=banyandb::server --instance-name=banyandb:2121
+  # ---- Service scope (cluster KPIs) ----
+  - query: swctl --display yaml --base-url=http://${oap_host}:${oap_12800}/graphql metrics exec --expression=meter_banyandb_cluster_write_rate --service-name=e2e-banyandb
     expected: expected/metrics-has-value.yml
-  - query: swctl --display yaml --base-url=http://${oap_host}:${oap_12800}/graphql metrics exec --expression=meter_banyandb_instance_total_cpu --service-name=banyandb::server --instance-name=banyandb:2121
+  - query: swctl --display yaml --base-url=http://${oap_host}:${oap_12800}/graphql metrics exec --expression=meter_banyandb_cluster_query_rate --service-name=e2e-banyandb
     expected: expected/metrics-has-value.yml
-  - query: swctl --display yaml --base-url=http://${oap_host}:${oap_12800}/graphql metrics exec --expression=meter_banyandb_instance_etcd_operation_rate --service-name=banyandb::server --instance-name=banyandb:2121
+  - query: swctl --display yaml --base-url=http://${oap_host}:${oap_12800}/graphql metrics exec --expression=meter_banyandb_total_cpu_cores --service-name=e2e-banyandb
     expected: expected/metrics-has-value.yml
-  - query: swctl --display yaml --base-url=http://${oap_host}:${oap_12800}/graphql metrics exec --expression=meter_banyandb_instance_active_instance --service-name=banyandb::server --instance-name=banyandb:2121
+  - query: swctl --display yaml --base-url=http://${oap_host}:${oap_12800}/graphql metrics exec --expression=meter_banyandb_total_memory_used --service-name=e2e-banyandb
     expected: expected/metrics-has-value.yml
-  - query: swctl --display yaml --base-url=http://${oap_host}:${oap_12800}/graphql metrics exec --expression=meter_banyandb_instance_cpu_usage --service-name=banyandb::server --instance-name=banyandb:2121
+  - query: swctl --display yaml --base-url=http://${oap_host}:${oap_12800}/graphql metrics exec --expression=meter_banyandb_total_disk_used --service-name=e2e-banyandb
     expected: expected/metrics-has-value.yml
-  - query: swctl --display yaml --base-url=http://${oap_host}:${oap_12800}/graphql metrics exec --expression=meter_banyandb_instance_rss_memory_usage --service-name=banyandb::server --instance-name=banyandb:2121
+  # live container count by role (labeled by container_name)
+  - query: swctl --display yaml --base-url=http://${oap_host}:${oap_12800}/graphql metrics exec --expression=meter_banyandb_reporting_instances --service-name=e2e-banyandb
+    expected: expected/metrics-has-label-value.yml
+
+  # ---- Instance scope (labeled by node_role/node_type, and operation where applicable) ----
+  # data node (banyandb-data-hot-0@data)
+  - query: swctl --display yaml --base-url=http://${oap_host}:${oap_12800}/graphql metrics exec --expression=meter_banyandb_instance_node_uptime --service-name=e2e-banyandb --instance-name=banyandb-data-hot-0@data
     expected: expected/metrics-has-value.yml
-  - query: swctl --display yaml --base-url=http://${oap_host}:${oap_12800}/graphql metrics exec --expression=meter_banyandb_instance_disk_usage_all --service-name=banyandb::server --instance-name=banyandb:2121
+  - query: swctl --display yaml --base-url=http://${oap_host}:${oap_12800}/graphql metrics exec --expression=meter_banyandb_instance_cpu_usage --service-name=e2e-banyandb --instance-name=banyandb-data-hot-0@data
+    expected: expected/metrics-has-label-value.yml
+  - query: swctl --display yaml --base-url=http://${oap_host}:${oap_12800}/graphql metrics exec --expression=meter_banyandb_instance_total_data --service-name=e2e-banyandb --instance-name=banyandb-data-hot-0@data
+    expected: expected/metrics-has-label-value.yml
+  - query: swctl --display yaml --base-url=http://${oap_host}:${oap_12800}/graphql metrics exec --expression=meter_banyandb_instance_queue_sub_throughput --service-name=e2e-banyandb --instance-name=banyandb-data-hot-0@data
+    expected: expected/metrics-has-label-value.yml
+  # liaison node (banyandb-liaison-0@liaison)
+  - query: swctl --display yaml --base-url=http://${oap_host}:${oap_12800}/graphql metrics exec --expression=meter_banyandb_instance_node_uptime --service-name=e2e-banyandb --instance-name=banyandb-liaison-0@liaison
     expected: expected/metrics-has-value.yml
-  - query: swctl --display yaml --base-url=http://${oap_host}:${oap_12800}/graphql metrics exec --expression=meter_banyandb_instance_network_usage_recv --service-name=banyandb::server --instance-name=banyandb:2121
+  - query: swctl --display yaml --base-url=http://${oap_host}:${oap_12800}/graphql metrics exec --expression=meter_banyandb_instance_write_rate --service-name=e2e-banyandb --instance-name=banyandb-liaison-0@liaison
+    expected: expected/metrics-has-label-value.yml
+  - query: swctl --display yaml --base-url=http://${oap_host}:${oap_12800}/graphql metrics exec --expression=meter_banyandb_instance_publish_throughput --service-name=e2e-banyandb --instance-name=banyandb-liaison-0@liaison
+    expected: expected/metrics-has-label-value.yml
+
+  # ---- Endpoint scope (storage group sw_metricsMinute) ----
+  - query: swctl --display yaml --base-url=http://${oap_host}:${oap_12800}/graphql metrics exec --expression=meter_banyandb_endpoint_write_rate --service-name=e2e-banyandb --endpoint-name=sw_metricsMinute
     expected: expected/metrics-has-value.yml
-  - query: swctl --display yaml --base-url=http://${oap_host}:${oap_12800}/graphql metrics exec --expression=meter_banyandb_instance_network_usage_sent --service-name=banyandb::server --instance-name=banyandb:2121
+  - query: swctl --display yaml --base-url=http://${oap_host}:${oap_12800}/graphql metrics exec --expression=meter_banyandb_endpoint_total_data --service-name=e2e-banyandb --endpoint-name=sw_metricsMinute
     expected: expected/metrics-has-value.yml
-
+  - query: swctl --display yaml --base-url=http://${oap_host}:${oap_12800}/graphql metrics exec --expression=meter_banyandb_endpoint_queue_throughput --service-name=e2e-banyandb --endpoint-name=sw_metricsMinute
+    expected: expected/metrics-has-label-value.yml
diff --git a/test/e2e-v2/cases/banyandb/docker-compose.yml b/test/e2e-v2/cases/banyandb/docker-compose.yml
index 2b0d618a81..fd8468552b 100644
--- a/test/e2e-v2/cases/banyandb/docker-compose.yml
+++ b/test/e2e-v2/cases/banyandb/docker-compose.yml
@@ -13,25 +13,44 @@
 # See the License for the specific language governing permissions and
 # limitations under the License.

+# SWIP-15 BanyanDB self-observability e2e: a minimal CLUSTER (1 liaison + 1 hot data node), scraped
+# WITHOUT the FODC proxy. The OTel collector scrapes each node's own :2121 Prometheus endpoint
+# directly and injects the FODC-equivalent identity labels
+# (cluster / container_name / node_role / node_type / pod_name) as static per-scrape-job labels
+# (see otel-collector-config.yaml). The MAL rules read those tags regardless of origin, so the
+# cluster / instance / endpoint rule set is exercised without a FODC deployment.
+#
+# BanyanDB 0.11+ is required: the FODC-proxy cluster observability and the queue / lifecycle metric
+# families this feature reads were introduced after 0.10, and 0.11 uses file/DNS node discovery
+# (no etcd). The image is pinned per-case to the latest 0.11-dev build the public demo runs
+# (commit 8a1936ce9); the repo-wide ${SW_BANYANDB_COMMIT} is an older build that predates the
+# queue/lifecycle metric families.
 services:
-  oap:
+  data-hot:
     extends:
       file: ../../script/docker-compose/base-compose.yml
-      service: oap
-    expose:
-      - 11800
-    ports:
-      - "11800:11800"
-      - "12800:12800"
+      service: banyandb-data
+    image: "ghcr.io/apache/skywalking-banyandb:8a1936ce96653e89d3d13250a42abc6e3d42fae7-testing"
+    hostname: data-hot
+    command: data --node-discovery-mode=file --node-discovery-file-path=/etc/banyandb/nodes.yaml --node-labels type=hot
+    volumes:
+      - ./nodes.yaml:/etc/banyandb/nodes.yaml
     networks:
       - e2e
-  banyandb:
+
+  liaison:
     extends:
       file: ../../script/docker-compose/base-compose.yml
-      service: banyandb
-    ports:
-      - "17913:17913"
-      - "2121:2121"
+      service: liaison
+    image: "ghcr.io/apache/skywalking-banyandb:8a1936ce96653e89d3d13250a42abc6e3d42fae7-testing"
+    command: liaison --node-discovery-mode=file --node-discovery-file-path=/etc/banyandb/nodes.yaml --data-node-selector type=hot
+    volumes:
+      - ./nodes.yaml:/etc/banyandb/nodes.yaml
+    depends_on:
+      data-hot:
+        condition: service_healthy
+    networks:
+      - e2e

   otel-collector:
     image: otel/opentelemetry-collector:${OTEL_COLLECTOR_VERSION}
@@ -42,6 +61,27 @@ services:
       - ./otel-collector-config.yaml:/etc/otel-collector-config.yaml
     expose:
       - 55678
+    depends_on:
+      liaison:
+        condition: service_healthy
+      data-hot:
+        condition: service_healthy
+
+  oap:
+    extends:
+      file: ../../script/docker-compose/base-compose.yml
+      service: oap
+    environment:
+      SW_STORAGE: banyandb
+      SW_STORAGE_BANYANDB_TARGETS: "liaison:17912"
+    ports:
+      - "11800:11800"
+      - "12800:12800"
+    networks:
+      - e2e
+    depends_on:
+      liaison:
+        condition: service_healthy

 networks:
   e2e:
diff --git a/test/e2e-v2/cases/banyandb/e2e.yaml b/test/e2e-v2/cases/banyandb/e2e.yaml
index 965f8da653..96b4ba2371 100644
--- a/test/e2e-v2/cases/banyandb/e2e.yaml
+++ b/test/e2e-v2/cases/banyandb/e2e.yaml
@@ -44,7 +44,7 @@ cleanup:
   on: failure
   output-dir: $SW_INFRA_E2E_LOG_DIR/banyandb-data
   items:
-    - service: banyandb
+    - service: data-hot
       paths:
         - /tmp/trace/
         - /tmp/stream/
@@ -52,3 +52,6 @@ cleanup:
         - /tmp/property/
         - /tmp/schema-property/
         - /tmp/accesslog/
+    - service: liaison
+      paths:
+        - /tmp/accesslog/
diff --git a/test/e2e-v2/cases/banyandb/docker-compose.yml b/test/e2e-v2/cases/banyandb/expected/metrics-has-label-value.yml
similarity index 52%
copy from test/e2e-v2/cases/banyandb/docker-compose.yml
copy to test/e2e-v2/cases/banyandb/expected/metrics-has-label-value.yml
index 2b0d618a81..6fc7bffccc 100644
--- a/test/e2e-v2/cases/banyandb/docker-compose.yml
+++ b/test/e2e-v2/cases/banyandb/expected/metrics-has-label-value.yml
@@ -13,35 +13,28 @@
 # See the License for the specific language governing permissions and
 # limitations under the License.

-services:
-  oap:
-    extends:
-      file: ../../script/docker-compose/base-compose.yml
-      service: oap
-    expose:
-      - 11800
-    ports:
-      - "11800:11800"
-      - "12800:12800"
-    networks:
-      - e2e
-  banyandb:
-    extends:
-      file: ../../script/docker-compose/base-compose.yml
-      service: banyandb
-    ports:
-      - "17913:17913"
-      - "2121:2121"
-
-  otel-collector:
-    image: otel/opentelemetry-collector:${OTEL_COLLECTOR_VERSION}
-    networks:
-      - e2e
-    command: [ "--config=/etc/otel-collector-config.yaml" ]
-    volumes:
-      - ./otel-collector-config.yaml:/etc/otel-collector-config.yaml
-    expose:
-      - 55678
-
-networks:
-  e2e:
+# For labeled metric series (e.g. reporting_instances by container_name, queue throughput by
+# operation): at least one result carries a non-empty label and at least one non-null value bucket.
+debuggingtrace: null +type: TIME_SERIES_VALUES +results: + {{- contains .results }} + - metric: + labels: + {{- contains .metric.labels }} + - key: {{ notEmpty .key }} + value: {{ notEmpty .value }} + {{- end }} + values: + {{- contains .values }} + - id: {{ notEmpty .id }} + value: {{ notEmpty .value }} + traceid: null + owner: null + - id: {{ notEmpty .id }} + value: null + traceid: null + owner: null + {{- end }} + {{- end}} +error: null diff --git a/test/e2e-v2/cases/banyandb/otel-collector-config.yaml b/test/e2e-v2/cases/banyandb/nodes.yaml similarity index 59% copy from test/e2e-v2/cases/banyandb/otel-collector-config.yaml copy to test/e2e-v2/cases/banyandb/nodes.yaml index 30d45627d2..9c0602822a 100644 --- a/test/e2e-v2/cases/banyandb/otel-collector-config.yaml +++ b/test/e2e-v2/cases/banyandb/nodes.yaml @@ -13,36 +13,9 @@ # See the License for the specific language governing permissions and # limitations under the License. -receivers: - prometheus: - config: - scrape_configs: - - job_name: "banyandb-monitoring" - scrape_interval: 5s - static_configs: - - targets: ["banyandb:2121"] - labels: - host_name: server - -processors: - batch: - -exporters: - otlp: - endpoint: oap:11800 - tls: - insecure: true - debug: - verbosity: detailed - -service: - pipelines: - metrics: - receivers: - - prometheus - processors: - - batch - exporters: - - otlp - - +# Static node-discovery file for the BanyanDB 0.11 cluster (file discovery mode; no etcd). +nodes: + - name: data-hot + grpc_address: data-hot:17912 + - name: liaison + grpc_address: liaison:17912 diff --git a/test/e2e-v2/cases/banyandb/otel-collector-config.yaml b/test/e2e-v2/cases/banyandb/otel-collector-config.yaml index 30d45627d2..74016d2303 100644 --- a/test/e2e-v2/cases/banyandb/otel-collector-config.yaml +++ b/test/e2e-v2/cases/banyandb/otel-collector-config.yaml @@ -13,16 +13,36 @@ # See the License for the specific language governing permissions and # limitations under the License. 
+# No FODC proxy in this e2e: the collector scrapes each BanyanDB node's own :2121 Prometheus +# endpoint directly and injects the identity labels the SWIP-15 MAL rules key on. On a real +# FODC deployment these are stamped by the proxy; here they are static per-target scrape labels +# (the same mechanism the previous single-node e2e used to inject host_name, extended to two +# targets and the full identity set). The OTel Prometheus receiver maps the Prometheus `job` to +# service.name, which OAP's receiver maps back to the `job_name` tag the rules filter on; all +# other static labels arrive as datapoint attributes (tags). Hence: +# - one job_name "banyandb-monitoring" (matches the filter on all three rule files) +# - per-target: cluster (service key), container_name (role discriminator + instance key), +# node_role, pod_name, and node_type (data only; liaison Elvis-defaults it to 'n/a'). receivers: prometheus: config: scrape_configs: - - job_name: "banyandb-monitoring" + - job_name: "banyandb-monitoring" scrape_interval: 5s static_configs: - - targets: ["banyandb:2121"] + - targets: ["liaison:2121"] labels: - host_name: server + cluster: e2e-banyandb + container_name: liaison + node_role: ROLE_LIAISON + pod_name: banyandb-liaison-0 + - targets: ["data-hot:2121"] + labels: + cluster: e2e-banyandb + container_name: data + node_role: ROLE_DATA + node_type: hot + pod_name: banyandb-data-hot-0 processors: batch: @@ -32,8 +52,6 @@ exporters: endpoint: oap:11800 tls: insecure: true - debug: - verbosity: detailed service: pipelines: @@ -44,5 +62,3 @@ service: - batch exporters: - otlp - -
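Reviewer note: the identity mapping described in the otel-collector-config.yaml comment can be sanity-checked offline with a minimal sketch like the one below. It copies the static labels of the two scrape targets from the diff and derives the keys the comment names (cluster as the service key, pod_name plus container_name as the instance key, node_type defaulting to 'n/a' when absent). The function name `identity`, the `REQUIRED_LABELS` set, and the `pod_name/container_name` instance-name format are assumptions made for illustration; they are not taken from the MAL rule files.

```python
"""Sketch: derive entity keys from the per-target scrape labels in this e2e.

Assumptions (not from the commit): the helper name, the required-label set,
and the "pod_name/container_name" instance-name format are hypothetical.
"""

# Labels every target is expected to carry; node_type is optional (data only).
REQUIRED_LABELS = ("cluster", "container_name", "node_role", "pod_name")

# Static labels copied from the two targets in otel-collector-config.yaml.
TARGETS = {
    "liaison:2121": {
        "cluster": "e2e-banyandb",
        "container_name": "liaison",
        "node_role": "ROLE_LIAISON",
        "pod_name": "banyandb-liaison-0",
    },
    "data-hot:2121": {
        "cluster": "e2e-banyandb",
        "container_name": "data",
        "node_role": "ROLE_DATA",
        "node_type": "hot",
        "pod_name": "banyandb-data-hot-0",
    },
}


def identity(labels):
    """Return the entity keys for one scrape target, or raise if labels are missing."""
    missing = [key for key in REQUIRED_LABELS if not labels.get(key)]
    if missing:
        raise ValueError(f"missing identity labels: {missing}")
    return {
        # Service = cluster.
        "service": labels["cluster"],
        # ServiceInstance = container; the separator is an assumed format.
        "instance": f'{labels["pod_name"]}/{labels["container_name"]}',
        "node_role": labels["node_role"],
        # Mirrors the Elvis-style default: an absent or empty value becomes 'n/a'.
        "node_type": labels.get("node_type") or "n/a",
    }


if __name__ == "__main__":
    for target, labels in TARGETS.items():
        print(target, identity(labels))
```

Running the script prints one derived identity per target. A target that lacks any of the four required labels raises `ValueError`, which is a quick way to catch a mislabelled scrape target before starting the e2e.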
