caps tuning

wusheng Mon, 22 Jun 2026 05:50:58 -0700

This is an automated email from the ASF dual-hosted git repository.

wu-sheng pushed a commit to branch feat/performance-config
in repository https://gitbox.apache.org/repos/asf/skywalking-horizon-ui.git


commit e10ea87f1a9eb108cce9bb4babe1f665a2ffd64e
Author: Wu Sheng <[email protected]>
AuthorDate: Mon Jun 22 20:49:58 2026 +0800

    feat(config): performance section in horizon.yaml - relocate fan-out/caps tuning
    
    Operational tuning that was hardcoded in routes or misplaced inside published
    dashboard templates now lives in one operator-owned, hot-reloaded `performance`
    section in horizon.yaml. Pure relocation — defaults equal the prior built-in
    values, enforced by a new schema-default-vs-example drift test.
    
    - performance.bulk: per-route bulk size + concurrency for the topology /
      3D-map / landing / dashboard OAP fan-outs (was hardcoded 150/200/4, 6/8, 6).
    - performance.limits: the service-map render valve (5000/15000) and per-request
      record caps for traces / logs / browser logs (maxPageSize).
    - The 3D map's metric fan-out moved out of its OAP-published template (the
      `pipeline` block) into performance.bulk.infra3d; the BFF injects it into the
      config response so the UI is unchanged, and a stale template still carrying
      `pipeline` is accepted and ignored.
    - Unified page-size pickers (20/30/50/100) across Traces, Logs, and Browser
      Logs (Browser Logs gains a picker; the trace cap drops 200 -> 100 to match).
    - Dockerfile sets a default NODE_OPTIONS=--max-old-space-size; docs cover
      Node-heap sizing against the in-memory source-map budget.
    - Fixed an example.yaml rbac drift (roles were missing infra-3d:read); the new
      drift test keeps schema defaults and horizon.example.yaml byte-identical.
    
    Validated: BFF+UI type-check, lint, 124+113 tests, both builds, license-eye
    0 invalid; live demo-OAP smoke (topology + traces) unchanged.
---
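For orientation, each `performance.bulk` entry pairs a bulk size (fragments or services per OAP request) with a concurrency (requests in flight). The sketch below shows how the two might interact. It is illustrative only: `toBatches`, `runPool`, and `simulateFanOut` are hypothetical names that do not appear in the patch, and the real routes use `fetchAliasedChunks` and `mapPool`, whose bodies are not shown in this email.

```typescript
// Hypothetical helpers; only the bulk-size / concurrency idea comes from the commit.

/** Split `items` into consecutive batches of at most `bulkSize` entries. */
function toBatches<T>(items: T[], bulkSize: number): T[][] {
  const batches: T[][] = [];
  for (let i = 0; i < items.length; i += bulkSize) {
    batches.push(items.slice(i, i + bulkSize));
  }
  return batches;
}

/** Run `fn` over `items` with at most `limit` calls in flight at once. */
async function runPool<T>(
  items: T[],
  limit: number,
  fn: (item: T) => Promise<void>,
): Promise<void> {
  let next = 0;
  const worker = async (): Promise<void> => {
    // Each worker pulls the next unclaimed item until the list is drained.
    while (next < items.length) {
      const item = items[next++] as T;
      await fn(item);
    }
  };
  const width = Math.max(1, Math.min(limit, items.length));
  await Promise.all(Array.from({ length: width }, () => worker()));
}

/** Simulate a fan-out and report request count and peak parallelism. */
async function simulateFanOut(
  total: number,
  bulkSize: number,
  concurrency: number,
): Promise<{ requests: number; peak: number }> {
  const fragments = Array.from({ length: total }, (_, i) => `m${i}`);
  const batches = toBatches(fragments, bulkSize);
  let inFlight = 0;
  let peak = 0;
  await runPool(batches, concurrency, async () => {
    inFlight += 1;
    peak = Math.max(peak, inFlight);
    await new Promise<void>((done) => setTimeout(done, 1)); // stand-in for one OAP request
    inFlight -= 1;
  });
  return { requests: batches.length, peak };
}
```

With the topology defaults from this commit (`nodeBulkSize` 150, `concurrency` 4), 400 node fragments would become 3 requests, all in flight together. Lowering the bulk size raises the request count, and lowering the concurrency spreads those requests over more waves.
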
 CHANGELOG.md                                       |  7 +-
 Dockerfile                                         |  4 +-
 .../bff/src/bundled_templates/infra-3d/config.json |  6 --
 apps/bff/src/config/schema.test.ts                 | 70 ++++++++++++++++++++
 apps/bff/src/config/schema.ts                      | 76 ++++++++++++++++++++++
 apps/bff/src/http/config/infra-3d.ts               | 16 ++++-
 apps/bff/src/http/query/browser-errors.ts          | 10 +--
 apps/bff/src/http/query/dashboard.ts               |  2 +-
 apps/bff/src/http/query/deployment.ts              |  5 +-
 apps/bff/src/http/query/endpoint-dependency.ts     |  5 +-
 apps/bff/src/http/query/instance-topology.ts       |  5 +-
 apps/bff/src/http/query/landing.ts                 | 12 ++--
 apps/bff/src/http/query/log.ts                     | 12 ++--
 apps/bff/src/http/query/topology.ts                | 15 ++---
 apps/bff/src/http/query/trace.ts                   | 26 ++++----
 apps/bff/src/logic/infra-3d/types.ts               | 14 ----
 apps/bff/src/logic/infra-3d/validate.ts            | 19 ++----
 .../browser-errors/LayerBrowserErrorsView.vue      | 13 +++-
 apps/ui/src/layer/logs/LayerLogsView.vue           |  1 +
 apps/ui/src/layer/traces/LayerTracesView.vue       |  3 +-
 apps/ui/src/layer/traces/LayerZipkinTracesView.vue |  3 +-
 docs/operate/infra-3d-map.md                       |  8 +--
 docs/setup/container-image.md                      | 15 +++++
 docs/setup/horizon-yaml.md                         | 60 +++++++++++++++++
 horizon.example.yaml                               | 40 +++++++++++-
 25 files changed, 358 insertions(+), 89 deletions(-)
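
The per-request record caps in `performance.limits.maxPageSize` are enforced by the three-argument `clampPageSize` this patch adds to log.ts, trace.ts, and browser-errors.ts. The standalone copy below reproduces that function from the diff and runs a few sample inputs; the `cap` constant and the `results` object are illustrative assumptions, not part of the patch.

```typescript
// Same body as the clampPageSize added in this patch.
function clampPageSize(requested: number | undefined, fallback: number, max: number): number {
  if (!Number.isFinite(requested as number) || (requested as number) < 1) return fallback;
  return Math.min(max, Math.round(requested as number));
}

// `cap` stands in for performance.limits.maxPageSize.traces (default 100).
const cap = 100;

const results = {
  oversized: clampPageSize(100000, 20, cap), // 100: a direct API caller cannot exceed the cap
  omitted: clampPageSize(undefined, 20, cap), // 20: missing field falls back
  zero: clampPageSize(0, 20, cap), // 20: values below 1 fall back
  fractional: clampPageSize(30.4, 20, cap), // 30: rounded to an integer
  loweredCap: clampPageSize(undefined, 50, 30), // 50: the fallback itself is not clamped
};

console.log(results);
```

The last case is worth noting when tuning: as written, the fallback is returned without being compared to `max`, so a cap set below a route's fallback (50 for logs and browser logs, 20 for traces) would not apply to requests that omit `pageSize`.
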

diff --git a/CHANGELOG.md b/CHANGELOG.md
index 9dfdb63..b381898 100644
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -6,7 +6,12 @@ The version line is shared by every package in the monorepo (apps + shared packa
 
 ## 1.0.0
 
-(In development — fill in highlights here before cutting the release.)
+### Performance & behavior tuning
+
+- **New `performance` section in `horizon.yaml`.** Tune how hard the BFF fans metric queries out to OAP, with per-route bulk (request) sizes and concurrency for the topology, 3D-map, landing, and dashboard fan-outs, plus protective caps: the service-map render valve (`topologyMaxNodes` / `topologyMaxEdges`) and per-request record caps for traces / logs / browser logs. Operational, hot-reloaded, per-deployment; defaults match the previous built-in values, so the whole block is optional. Rais [...]
+- **3D-map fan-out tuning moved out of the dashboard template into `horizon.yaml`** (`performance.bulk.infra3d`). These metric concurrency / batch knobs were operational settings misplaced in a published-to-OAP dashboard template (not even surfaced in the admin editor); a stale template still carrying the old `pipeline` block is now accepted and ignored.
+- **Unified page-size pickers across the event lists.** Traces, Logs, and Browser Logs share a `20 / 30 / 50 / 100` page-size dropdown, and Browser Logs gains a picker it never had (it previously used a fixed 100). Each picker's max matches the server-side fetch cap in `performance.limits.maxPageSize`.
+- **Node memory sizing guidance.** The container image now sets a default `NODE_OPTIONS=--max-old-space-size`, and the docs cover sizing the Node heap to your container memory limit and the in-memory source-map budget.
 
 ## 0.7.0
 
diff --git a/Dockerfile b/Dockerfile
index d77e34b..fe998dc 100644
--- a/Dockerfile
+++ b/Dockerfile
@@ -79,7 +79,9 @@ ENV NODE_ENV=production \
     HORIZON_SETUP_FILE=/data/horizon-setup.json \
     HORIZON_ALARMS_FILE=/data/horizon-alarms.json \
     HORIZON_WIRE_LOG_FILE=/data/horizon-wire.jsonl \
-    HORIZON_SOURCEMAPS_DIR=/app/sourcemaps
+    HORIZON_SOURCEMAPS_DIR=/app/sourcemaps \
+    # Match this to the container memory limit and your sourceMaps budget; the in-heap map cache lives inside this heap limit.
+    NODE_OPTIONS=--max-old-space-size=768
 
 USER horizon
 EXPOSE 8081
diff --git a/apps/bff/src/bundled_templates/infra-3d/config.json b/apps/bff/src/bundled_templates/infra-3d/config.json
index 4bb6e87..967a026 100644
--- a/apps/bff/src/bundled_templates/infra-3d/config.json
+++ b/apps/bff/src/bundled_templates/infra-3d/config.json
@@ -8,12 +8,6 @@
     "crossLevelCall": { "color": "#f0a04b", "style": "solid",  "arrow": true  },
     "intraCall":      { "color": "rgba(255,255,255,0.4)", "style": "solid", "arrow": false }
   },
-  "pipeline": {
-    "metricChunkSize":     6,
-    "metricConcurrency":   4,
-    "topologyConcurrency": 4,
-    "templateConcurrency": 8
-  },
   "unknownLayer": {
     "level": "middleware",
     "badge": "unclassified"
diff --git a/apps/bff/src/config/schema.test.ts b/apps/bff/src/config/schema.test.ts
new file mode 100644
index 0000000..f742523
--- /dev/null
+++ b/apps/bff/src/config/schema.test.ts
@@ -0,0 +1,70 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+import { readFileSync } from 'node:fs';
+import { fileURLToPath } from 'node:url';
+import { dirname, resolve } from 'node:path';
+import { describe, expect, it } from 'vitest';
+import YAML from 'yaml';
+import { configSchema } from './schema.js';
+import { interpolateEnv } from './loader.js';
+
+describe('configSchema defaults', () => {
+  it('parses an empty object — every non-optional field has a default', () => {
+    expect(() => configSchema.parse({})).not.toThrow();
+  });
+});
+
+// Guard against horizon.example.yaml drifting from the schema defaults. The
+// example is "reference, not override" — every value it shows is meant to
+// equal what the BFF runs with when the block is omitted. If a default
+// changes (or someone edits the example to a non-default), this fails so the
+// two are reconciled before merge.
+describe('horizon.example.yaml matches schema defaults', () => {
+  const here = dirname(fileURLToPath(import.meta.url));
+  const examplePath = resolve(here, '../../../../horizon.example.yaml');
+  const example = YAML.parse(interpolateEnv(readFileSync(examplePath, 'utf8'))) ?? {};
+  const defaults = configSchema.parse({}) as Record<string, unknown>;
+
+  // YAML omits a value as null; the schema models the same absence as the
+  // empty string (interpolated `${VAR:}`). Treat the two as equal so an
+  // unset path doesn't read as drift.
+  const norm = (v: unknown): unknown => (v === null || v === undefined ? '' : v);
+
+  // Walk only what the example actually declares; the example is allowed to
+  // omit fields (those fall back to defaults at runtime). Every scalar /
+  // array it DOES carry must match the parsed default at the same path.
+  const walk = (exVal: unknown, defVal: unknown, path: string): void => {
+    if (Array.isArray(exVal) || (exVal !== null && typeof exVal === 'object')) {
+      if (Array.isArray(exVal)) {
+        expect(defVal, `${path} should be an array in defaults`).toEqual(exVal);
+        return;
+      }
+      const exObj = exVal as Record<string, unknown>;
+      const defObj = (defVal ?? {}) as Record<string, unknown>;
+      for (const key of Object.keys(exObj)) {
+        walk(exObj[key], defObj[key], path ? `${path}.${key}` : key);
+      }
+      return;
+    }
+    expect(norm(exVal), `${path} drifted from schema default`).toEqual(norm(defVal));
+  };
+
+  it('every value present in the example equals the schema default', () => {
+    walk(example, defaults, '');
+  });
+});
diff --git a/apps/bff/src/config/schema.ts b/apps/bff/src/config/schema.ts
index 51f6fd6..c220d3b 100644
--- a/apps/bff/src/config/schema.ts
+++ b/apps/bff/src/config/schema.ts
@@ -385,6 +385,81 @@ const layersSchema = z
   .strict()
   .default({ excluded: DEFAULT_EXCLUDED_LAYERS });
 
+// ────────────────────────────────────────────────────────────────────
+// Performance / behavior tuning — how hard the BFF fans queries out to
+// OAP, plus the render / fetch caps that protect storage. OPERATIONAL,
+// per-deployment, hot-reloaded — NOT dashboard content (those live in
+// templates published to OAP). Defaults equal the built-in values, so
+// omitting this block changes nothing. Every value is clamped to a hard
+// ceiling (the `.max()` below) — config can lower, never exceed it.
+const performanceSchema = z
+  .object({
+    bulk: z
+      .object({
+        // Service-map family routes (topology / instance-topology /
+        // deployment / endpoint-dependency). `*BulkSize` = aliased MQE
+        // fragments per OAP request; `concurrency` = parallel requests.
+        topology: z
+          .object({
+            nodeBulkSize: z.number().int().min(1).max(500).default(150),
+            edgeBulkSize: z.number().int().min(1).max(500).default(200),
+            concurrency: z.number().int().min(1).max(16).default(4),
+          })
+          .strict()
+          .default({}),
+        // 3D infrastructure-map metric fan-out (relocated from the 3D
+        // template's former `pipeline` block).
+        infra3d: z
+          .object({
+            metricBulkSize: z.number().int().min(1).max(12).default(6),
+            metricConcurrency: z.number().int().min(1).max(8).default(4),
+            topologyConcurrency: z.number().int().min(1).max(16).default(4),
+            templateConcurrency: z.number().int().min(1).max(32).default(8),
+          })
+          .strict()
+          .default({}),
+        // Per-layer landing: metric columns fetched in service batches.
+        landing: z
+          .object({
+            bulkSize: z.number().int().min(1).max(12).default(6),
+            concurrency: z.number().int().min(1).max(16).default(8),
+          })
+          .strict()
+          .default({}),
+        // Dashboard widget metric fan-out.
+        dashboard: z
+          .object({
+            bulkSize: z.number().int().min(1).max(12).default(6),
+          })
+          .strict()
+          .default({}),
+      })
+      .strict()
+      .default({}),
+    limits: z
+      .object({
+        // Service-map render valve: a graph larger than this is rejected
+        // with a "narrow the scope" notice rather than drawn unreadably.
+        topologyMaxNodes: z.number().int().positive().default(5000),
+        topologyMaxEdges: z.number().int().positive().default(15000),
+        // Max RECORDS per request (the OAP storage LIMIT) for each event
+        // list — NOT a page count. The UI page-size picker maxes at the
+        // same value, so a client can't out-ask the dropdown.
+        maxPageSize: z
+          .object({
+            traces: z.number().int().min(1).max(500).default(100),
+            logs: z.number().int().min(1).max(500).default(100),
+            browserLogs: z.number().int().min(1).max(500).default(100),
+          })
+          .strict()
+          .default({}),
+      })
+      .strict()
+      .default({}),
+  })
+  .strict()
+  .default({});
+
 export const configSchema = z
   .object({
     server: serverSchema.default({}),
@@ -399,6 +474,7 @@ export const configSchema = z
     debugLog: debugLogSchema,
     query: querySchema,
     sourceMaps: sourceMapsSchema,
+    performance: performanceSchema,
     // Deprecated + ignored. The 3D-map config moved to OAP (a template kind);
     // the old file-backed `infra3d.file` knob is gone. Accepted here (rather
     // than rejected by `.strict()`) so an existing config carrying the block
diff --git a/apps/bff/src/http/config/infra-3d.ts b/apps/bff/src/http/config/infra-3d.ts
index be03c3f..84ac1f4 100644
--- a/apps/bff/src/http/config/infra-3d.ts
+++ b/apps/bff/src/http/config/infra-3d.ts
@@ -60,7 +60,21 @@ export function registerInfra3dConfigRoutes(
     { preHandler: auth },
     async (_req: FastifyRequest, reply: FastifyReply) => {
       const cfg = await resolveEffectiveConfig(deps);
-      return reply.send(cfg);
+      // The metric fan-out budget is OPERATIONAL (per-deployment, hot-
+      // reloaded), so it lives in horizon.yaml — NOT the published template.
+      // Inject it server-side so the UI keeps reading `cfg.pipeline.*`; this
+      // overrides any stale `pipeline` a hand-edited / imported template row
+      // might still carry (validate.ts accepts-and-ignores it).
+      const perf = deps.config.current.performance.bulk.infra3d;
+      return reply.send({
+        ...cfg,
+        pipeline: {
+          metricChunkSize: perf.metricBulkSize,
+          metricConcurrency: perf.metricConcurrency,
+          topologyConcurrency: perf.topologyConcurrency,
+          templateConcurrency: perf.templateConcurrency,
+        },
+      });
     },
   );
 }
diff --git a/apps/bff/src/http/query/browser-errors.ts b/apps/bff/src/http/query/browser-errors.ts
index 5e78455..0a86690 100644
--- a/apps/bff/src/http/query/browser-errors.ts
+++ b/apps/bff/src/http/query/browser-errors.ts
@@ -49,10 +49,12 @@ export interface BrowserErrorsRouteDeps {
 }
 
 const DEFAULT_WINDOW_MIN = 30;
-const MAX_PAGE_SIZE = 100;
-function clampPageSize(requested: number | undefined, fallback: number): number {
+/** OAP feeds `paging.pageSize` straight to storage as a LIMIT. The cap
+ *  is `performance.limits.maxPageSize.browserLogs` (default 100);
+ *  mirror that server-side so the cap holds against direct API callers. */
+function clampPageSize(requested: number | undefined, fallback: number, max: number): number {
   if (!Number.isFinite(requested as number) || (requested as number) < 1) return fallback;
-  return Math.min(MAX_PAGE_SIZE, Math.round(requested as number));
+  return Math.min(max, Math.round(requested as number));
 }
 
 function defaultWindow(
@@ -194,7 +196,7 @@ export function registerBrowserErrorsRoute(app: FastifyInstance, deps: BrowserEr
         queryDuration: withColdStage(req, { start: window.start, end: window.end, step: 'SECOND' }),
         paging: {
           pageNum: Math.max(1, Math.round(body.page ?? 1)),
-          pageSize: clampPageSize(body.pageSize, 50),
+          pageSize: clampPageSize(body.pageSize, 50, deps.config.current.performance.limits.maxPageSize.browserLogs),
         },
       };
 
diff --git a/apps/bff/src/http/query/dashboard.ts b/apps/bff/src/http/query/dashboard.ts
index 42b3dd7..fed6c03 100644
--- a/apps/bff/src/http/query/dashboard.ts
+++ b/apps/bff/src/http/query/dashboard.ts
@@ -792,7 +792,7 @@ export function registerDashboardQueryRoute(app: FastifyInstance, deps: Dashboar
       // round-trip while staying inside OAP's per-query budget.
       // Gate-skipped widgets are excluded here (their wIdx keeps its
       // original index so Step 3's result map still lines up).
-      const MAX_WIDGETS_PER_BATCH = 6;
+      const MAX_WIDGETS_PER_BATCH = cfgCurrent.performance.bulk.dashboard.bulkSize;
       const batchWidgets = widgets
         .map((widget, wIdx) => ({ widget, wIdx }))
         .filter(({ wIdx }) => !skipped.has(wIdx));
diff --git a/apps/bff/src/http/query/deployment.ts b/apps/bff/src/http/query/deployment.ts
index 34e65ec..b4461a7 100644
--- a/apps/bff/src/http/query/deployment.ts
+++ b/apps/bff/src/http/query/deployment.ts
@@ -284,6 +284,7 @@ export function registerDeploymentRoute(
       }
 
       const cfgCurrent = deps.config.current;
+      const perf = cfgCurrent.performance;
       const opts = buildOapOpts(cfgCurrent, deps.fetch);
       const offset = await getServerOffsetMinutes(deps.config, deps.fetch);
       // Honor the SPA's topbar picker triplet; else fall back to the
@@ -507,8 +508,8 @@ export function registerDeploymentRoute(
       // track failed metric chunks → surface "blank may be unavailable, not zero"
       const mstats = { failed: 0, total: 0 };
       const [nodeEnv, edgeEnv] = await Promise.all([
-        fetchAliasedChunks<MqeShape>(opts, nodeFragments, 150, 'DeploymentNodeMetrics', 4, mstats),
-        fetchAliasedChunks<MqeShape>(opts, edgeFragments, 200, 'DeploymentEdgeMetrics', 4, mstats),
+        fetchAliasedChunks<MqeShape>(opts, nodeFragments, perf.bulk.topology.nodeBulkSize, 'DeploymentNodeMetrics', perf.bulk.topology.concurrency, mstats),
+        fetchAliasedChunks<MqeShape>(opts, edgeFragments, perf.bulk.topology.edgeBulkSize, 'DeploymentEdgeMetrics', perf.bulk.topology.concurrency, mstats),
       ]);
 
       for (const [alias, shape] of Object.entries(nodeEnv)) {
diff --git a/apps/bff/src/http/query/endpoint-dependency.ts b/apps/bff/src/http/query/endpoint-dependency.ts
index f2052da..6005911 100644
--- a/apps/bff/src/http/query/endpoint-dependency.ts
+++ b/apps/bff/src/http/query/endpoint-dependency.ts
@@ -287,6 +287,7 @@ export function registerEndpointDependencyRoute(
       }
 
       const cfgCurrent = deps.config.current;
+      const perf = cfgCurrent.performance;
       const opts = buildOapOpts(cfgCurrent, deps.fetch);
       const offset = await getServerOffsetMinutes(deps.config, deps.fetch);
       // Honor the SPA's topbar picker triplet; else fall back to the
@@ -472,8 +473,8 @@ export function registerEndpointDependencyRoute(
       // track failed metric chunks → surface "blank may be unavailable, not zero"
       const mstats = { failed: 0, total: 0 };
       const [nodeEnv, edgeEnv] = await Promise.all([
-        fetchAliasedChunks<MqeShape>(opts, nodeFragments, 150, 'EndpointMetrics', 4, mstats),
-        fetchAliasedChunks<MqeShape>(opts, edgeFragments, 200, 'EndpointEdgeMetrics', 4, mstats),
+        fetchAliasedChunks<MqeShape>(opts, nodeFragments, perf.bulk.topology.nodeBulkSize, 'EndpointMetrics', perf.bulk.topology.concurrency, mstats),
+        fetchAliasedChunks<MqeShape>(opts, edgeFragments, perf.bulk.topology.edgeBulkSize, 'EndpointEdgeMetrics', perf.bulk.topology.concurrency, mstats),
       ]);
 
       for (const [alias, shape] of Object.entries(nodeEnv)) {
diff --git a/apps/bff/src/http/query/instance-topology.ts b/apps/bff/src/http/query/instance-topology.ts
index fb02f65..b940c5a 100644
--- a/apps/bff/src/http/query/instance-topology.ts
+++ b/apps/bff/src/http/query/instance-topology.ts
@@ -255,6 +255,7 @@ export function registerInstanceTopologyRoute(
       }
 
       const cfgCurrent = deps.config.current;
+      const perf = cfgCurrent.performance;
       const opts = buildOapOpts(cfgCurrent, deps.fetch);
       const offset = await getServerOffsetMinutes(deps.config, deps.fetch);
       // Honor the SPA's topbar picker triplet; else fall back to the
@@ -386,8 +387,8 @@ export function registerInstanceTopologyRoute(
       // track failed metric chunks → surface "blank may be unavailable, not zero"
       const mstats = { failed: 0, total: 0 };
       const [nodeEnv, edgeEnv] = await Promise.all([
-        fetchAliasedChunks<MqeShape>(opts, nodeFragments, 150, 'InstanceNodeMetrics', 4, mstats),
-        fetchAliasedChunks<MqeShape>(opts, edgeFragments, 200, 'InstanceEdgeMetrics', 4, mstats),
+        fetchAliasedChunks<MqeShape>(opts, nodeFragments, perf.bulk.topology.nodeBulkSize, 'InstanceNodeMetrics', perf.bulk.topology.concurrency, mstats),
+        fetchAliasedChunks<MqeShape>(opts, edgeFragments, perf.bulk.topology.edgeBulkSize, 'InstanceEdgeMetrics', perf.bulk.topology.concurrency, mstats),
       ]);
 
       for (const [alias, shape] of Object.entries(nodeEnv)) {
diff --git a/apps/bff/src/http/query/landing.ts b/apps/bff/src/http/query/landing.ts
index 5f176e7..0341aaf 100644
--- a/apps/bff/src/http/query/landing.ts
+++ b/apps/bff/src/http/query/landing.ts
@@ -97,8 +97,8 @@ const DEFAULT_WINDOW_MIN = 60;
 // The batches then drain through a bounded-concurrency pool so a large
 // layer fans out in controlled waves, not a thundering herd. The number of
 // services probed per request is itself bounded by `query.landingServiceCap`.
-const MAX_SERVICES_PER_BATCH = 6;
-const LANDING_BATCH_CONCURRENCY = 8;
+// Batch size + pool width are config-tunable via
+// `performance.bulk.landing.{bulkSize,concurrency}` (read in the handler).
 
 /** Run `fn` over `items` with at most `limit` promises in flight at once. */
 async function mapPool<T>(items: T[], limit: number, fn: (item: T) => Promise<void>): Promise<void> {
@@ -265,6 +265,8 @@ export function registerLandingRoute(app: FastifyInstance, deps: LandingRouteDep
       const cfg = parsed.data;
       const oapLayer = layerKey.toUpperCase();
       const cfgCurrent = deps.config.current;
+      const { bulkSize: maxServicesPerBatch, concurrency: batchConcurrency } =
+        cfgCurrent.performance.bulk.landing;
       const opts = buildOapOpts(cfgCurrent, deps.fetch);
       const offset = await getServerOffsetMinutes(deps.config, deps.fetch);
       // Honor the SPA's topbar time picker when all three triplet fields
@@ -353,10 +355,10 @@ export function registerLandingRoute(app: FastifyInstance, deps: LandingRouteDep
         const out = new Map<string, MqeResultShape>();
         if (svcList.length === 0 || !cols.some((c) => !!c.expression)) return out;
         const chunks: (typeof svcList)[] = [];
-        for (let i = 0; i < svcList.length; i += MAX_SERVICES_PER_BATCH) {
-          chunks.push(svcList.slice(i, i + MAX_SERVICES_PER_BATCH));
+        for (let i = 0; i < svcList.length; i += maxServicesPerBatch) {
+          chunks.push(svcList.slice(i, i + maxServicesPerBatch));
         }
-        await mapPool(chunks, LANDING_BATCH_CONCURRENCY, async (batch) => {
+        await mapPool(chunks, batchConcurrency, async (batch) => {
           const fragments: string[] = [];
           const back: { a: string; key: string }[] = [];
           batch.forEach((svc, li) => {
diff --git a/apps/bff/src/http/query/log.ts b/apps/bff/src/http/query/log.ts
index 0a9aa27..74cc6cf 100644
--- a/apps/bff/src/http/query/log.ts
+++ b/apps/bff/src/http/query/log.ts
@@ -53,12 +53,12 @@ export interface LogRouteDeps {
 
 const DEFAULT_WINDOW_MIN = 30;
 /** OAP feeds `paging.pageSize` straight to its storage layer as a
- *  LIMIT clause. The UI picker caps at 100; mirror that server-side so
- *  the cap holds against direct API callers. */
-const MAX_LOG_PAGE_SIZE = 100;
-function clampPageSize(requested: number | undefined, fallback: number): number {
+ *  LIMIT clause. The cap is `performance.limits.maxPageSize.logs`
+ *  (default 100); mirror that server-side so the cap holds against
+ *  direct API callers. */
+function clampPageSize(requested: number | undefined, fallback: number, max: number): number {
   if (!Number.isFinite(requested as number) || (requested as number) < 1) return fallback;
-  return Math.min(MAX_LOG_PAGE_SIZE, Math.round(requested as number));
+  return Math.min(max, Math.round(requested as number));
 }
 
 /** Build the log query window as SECOND-precision strings. Logs are
@@ -223,7 +223,7 @@ export function registerLogRoute(app: FastifyInstance, deps: LogRouteDeps): void
         queryDuration: withColdStage(req, { start: window.start, end: window.end, step: 'SECOND' }),
         paging: {
           pageNum: Math.max(1, Math.round(body.page ?? 1)),
-          pageSize: clampPageSize(body.pageSize, 50),
+          pageSize: clampPageSize(body.pageSize, 50, deps.config.current.performance.limits.maxPageSize.logs),
         },
       };
 
diff --git a/apps/bff/src/http/query/topology.ts b/apps/bff/src/http/query/topology.ts
index a718b9d..8594d3a 100644
--- a/apps/bff/src/http/query/topology.ts
+++ b/apps/bff/src/http/query/topology.ts
@@ -204,11 +204,6 @@ export function seriesFromMqe(env: MqeShape | undefined): Array<number | null> |
   });
 }
 
-// Safety valve: above this the graph can't render legibly and risks OOMing the
-// browser, so the route rejects with guidance rather than drawing a partial map.
-const TOPOLOGY_MAX_NODES = 5000;
-const TOPOLOGY_MAX_EDGES = 15000;
-
 function emptyResponse(
   layerKey: string,
   serviceArg: string | null,
@@ -307,6 +302,7 @@ export function registerTopologyRoute(app: FastifyInstance, deps: TopologyRouteD
       }
 
       const cfgCurrent = deps.config.current;
+      const perf = cfgCurrent.performance;
       const opts = buildOapOpts(cfgCurrent, deps.fetch);
       const offset = await getServerOffsetMinutes(deps.config, deps.fetch);
       // Honor the SPA's topbar time picker when all three triplet
@@ -425,7 +421,10 @@ export function registerTopologyRoute(app: FastifyInstance, deps: TopologyRouteD
 
       // Reject-with-guidance instead of a partial graph: too large to draw
       // legibly + risks OOMing the browser. UI shows a narrow-scope hint.
-      if (nodes.size > TOPOLOGY_MAX_NODES || calls.size > TOPOLOGY_MAX_EDGES) {
+      if (
+        nodes.size > perf.limits.topologyMaxNodes ||
+        calls.size > perf.limits.topologyMaxEdges
+      ) {
         return reply.send({
           ...emptyResponse(layerKey, serviceArg, depth, topoCfg, true),
           tooLarge: { nodes: nodes.size, edges: calls.size },
@@ -523,8 +522,8 @@ export function registerTopologyRoute(app: FastifyInstance, deps: TopologyRouteD
       // unavailable, not zero" rather than letting an OAP 5xx read as no-traffic.
       const mstats = { failed: 0, total: 0 };
       const [nodeEnv, edgeEnv] = await Promise.all([
-        fetchAliasedChunks<MqeShape>(opts, nodeFragments, 150, 'NodeMetrics', 4, mstats),
-        fetchAliasedChunks<MqeShape>(opts, edgeFragments, 200, 'EdgeMetrics', 4, mstats),
+        fetchAliasedChunks<MqeShape>(opts, nodeFragments, perf.bulk.topology.nodeBulkSize, 'NodeMetrics', perf.bulk.topology.concurrency, mstats),
+        fetchAliasedChunks<MqeShape>(opts, edgeFragments, perf.bulk.topology.edgeBulkSize, 'EdgeMetrics', perf.bulk.topology.concurrency, mstats),
       ]);
 
       for (const [alias, shape] of Object.entries(nodeEnv)) {
diff --git a/apps/bff/src/http/query/trace.ts b/apps/bff/src/http/query/trace.ts
index 9f03e0a..896cbb5 100644
--- a/apps/bff/src/http/query/trace.ts
+++ b/apps/bff/src/http/query/trace.ts
@@ -73,13 +73,13 @@ const DEFAULT_WINDOW_MIN = 30;
 const MAX_WINDOW_MIN = 60 * 24 * 7; // 1 week guard
 /** OAP feeds `paging.pageSize` straight to its storage layer as a
  *  LIMIT clause (PaginationUtils.java). A direct API caller could
- *  otherwise pass `pageSize: 100000` and exhaust the backend. The UI
- *  picker caps at 200 — match that server-side, allowing graceful
- *  defaulting when the body omits or mangles the field. */
-const MAX_TRACE_PAGE_SIZE = 200;
-function clampPageSize(requested: number | undefined, fallback: number): number {
+ *  otherwise pass `pageSize: 100000` and exhaust the backend. The cap
+ *  is `performance.limits.maxPageSize.traces` (default 100) — match the
+ *  UI picker server-side, allowing graceful defaulting when the body
+ *  omits or mangles the field. */
+function clampPageSize(requested: number | undefined, fallback: number, max: number): number {
   if (!Number.isFinite(requested as number) || (requested as number) < 1) return fallback;
-  return Math.min(MAX_TRACE_PAGE_SIZE, Math.round(requested as number));
+  return Math.min(max, Math.round(requested as number));
 }
 // Traces are RECORD-style data and have no metric-bucket cap on OAP
 // (`DurationUtils.MAX_TIME_RANGE` only applies to metric queries via
@@ -267,6 +267,7 @@ function buildTraceCondition(
   resolvedServiceId: string | null,
   w: { start: string; end: string },
   coldStage: boolean,
+  maxPageSize: number,
 ) {
   return {
     ...(resolvedServiceId ? { serviceId: resolvedServiceId } : {}),
@@ -289,7 +290,7 @@ function buildTraceCondition(
       // OAP forwards `pageSize` straight to storage as a LIMIT
       // (PaginationUtils.java). The UI picker caps at 200; mirror that
       // server-side so the cap holds against direct API callers.
-      pageSize: clampPageSize(body.pageSize, 20),
+      pageSize: clampPageSize(body.pageSize, 20, maxPageSize),
     },
   };
 }
@@ -300,6 +301,7 @@ async function fetchNativeList(
   layerKey: string,
   coldStage: boolean,
   offsetMinutes: number,
+  maxPageSize: number,
 ): Promise<NativeTraceListResponse> {
   const api = await detectTraceQueryApi(opts);
   // Explicit start+end takes precedence over windowMinutes; falling
@@ -322,7 +324,7 @@ async function fetchNativeList(
       error: err instanceof Error ? err.message : String(err),
     };
   }
-  const condition = buildTraceCondition(body, serviceId, window, coldStage);
+  const condition = buildTraceCondition(body, serviceId, window, coldStage, maxPageSize);
   try {
     if (api === 'queryTraces') {
       const env = await graphqlPost<{
@@ -383,13 +385,14 @@ async function fetchNativeList(
 async function fetchZipkinList(
   opts: GraphqlOptions,
   body: TraceListBody,
+  maxPageSize: number,
 ): Promise<ZipkinTraceListResponse> {
   try {
     const traces = await zipkinFetchTraces(opts, {
       serviceName: body.service,
       minDuration: body.minTraceDuration,
       maxDuration: body.maxTraceDuration,
-      limit: clampPageSize(body.pageSize, 20),
+      limit: clampPageSize(body.pageSize, 20, maxPageSize),
     });
     return { source: 'zipkin', traces, reachable: true };
   } catch (err) {
@@ -434,6 +437,7 @@ export function registerTraceRoutes(app: FastifyInstance, deps: TraceRouteDeps):
       const requestedSource: TraceSource = body.source ?? tracesCfg.source;
       const opts = buildOapOpts(deps.config.current, deps.fetch);
       const offset = await getServerOffsetMinutes(deps.config, deps.fetch);
+      const maxPageSize = deps.config.current.performance.limits.maxPageSize.traces;
 
       const wantNative = requestedSource === 'both' || requestedSource === 'native';
       const wantZipkin = requestedSource === 'both' || requestedSource === 'zipkin';
@@ -441,9 +445,9 @@ export function registerTraceRoutes(app: FastifyInstance, deps: TraceRouteDeps):
       // response — the UI's empty / error states cover each slot.
       const [native, zipkin] = await Promise.all([
         wantNative
-          ? fetchNativeList(opts, body, layerKey, !!req.coldStage, offset)
+          ? fetchNativeList(opts, body, layerKey, !!req.coldStage, offset, maxPageSize)
           : Promise.resolve(undefined),
-        wantZipkin ? fetchZipkinList(opts, body) : Promise.resolve(undefined),
+        wantZipkin ? fetchZipkinList(opts, body, maxPageSize) : Promise.resolve(undefined),
       ]);
 
       const response: TraceListResponse = {
diff --git a/apps/bff/src/logic/infra-3d/types.ts b/apps/bff/src/logic/infra-3d/types.ts
index cc97479..eb93a93 100644
--- a/apps/bff/src/logic/infra-3d/types.ts
+++ b/apps/bff/src/logic/infra-3d/types.ts
@@ -116,19 +116,6 @@ export interface InfraEdgeStyle {
   arrow: boolean;
 }
 
-export interface InfraPipelineLimits {
-  /** Service-bundles per MQE batch in stage 5. Mirrors the existing
-   *  landing / dashboard chunking constant (6) so the 3D map shares the
-   *  same OAP back-pressure profile. */
-  metricChunkSize: number;
-  /** Max concurrent metric-chunk requests in stage 5 (each still ≤ metricChunkSize services). */
-  metricConcurrency: number;
-  /** Max concurrent `getServicesTopology` calls in stage 3. */
-  topologyConcurrency: number;
-  /** Max concurrent `getLayerTemplate` calls in stage 2. */
-  templateConcurrency: number;
-}
-
 export interface Infra3dConfig {
   filter: {
     /** Global layer regex applied before levelling. Default `.*`. */
@@ -139,7 +126,6 @@ export interface Infra3dConfig {
     crossLevelCall: InfraEdgeStyle;
     intraCall: InfraEdgeStyle;
   };
-  pipeline: InfraPipelineLimits;
   /** Where to put OAP layers that don't appear in any level's explicit
    *  `layers` list and don't match any level's regex. The cube renders
    *  with a small `badge` chip so the admin notices. */
diff --git a/apps/bff/src/logic/infra-3d/validate.ts b/apps/bff/src/logic/infra-3d/validate.ts
index 6bf50e7..e6a61d2 100644
--- a/apps/bff/src/logic/infra-3d/validate.ts
+++ b/apps/bff/src/logic/infra-3d/validate.ts
@@ -105,19 +105,12 @@ const configSchema = z
         intraCall: edgeStyleSchema,
       })
       .strict(),
-    pipeline: z
-      .object({
-        // Cap matches the metrics route's MAX_SERVICES (infra-3d-metrics.ts):
-        // each metric chunk is one GraphQL request, and OAP's complexity
-        // ceiling 5xx's beyond 12 services. A larger chunk size makes every
-        // oversized request fail, so reject it at config-save time.
-        metricChunkSize: z.number().int().min(1).max(12),
-        // Concurrent chunks in flight (each still ≤ chunkSize); default 4 for older configs.
-        metricConcurrency: z.number().int().min(1).max(8).default(4),
-        topologyConcurrency: z.number().int().min(1).max(16),
-        templateConcurrency: z.number().int().min(1).max(32),
-      })
-      .strict(),
+    // Deprecated + ignored. The metric fan-out budget moved to horizon.yaml
+    // (performance.bulk.infra3d) — the config endpoint injects the live value
+    // server-side. Accepted here (rather than rejected by `.strict()`) so a
+    // stale saved / imported row that still carries the block keeps loading;
+    // the value is unused.
+    pipeline: z.unknown().optional(),
     unknownLayer: z
       .object({
         level: z.string().min(1),
diff --git a/apps/ui/src/layer/browser-errors/LayerBrowserErrorsView.vue b/apps/ui/src/layer/browser-errors/LayerBrowserErrorsView.vue
index 003e454..796b570 100644
--- a/apps/ui/src/layer/browser-errors/LayerBrowserErrorsView.vue
+++ b/apps/ui/src/layer/browser-errors/LayerBrowserErrorsView.vue
@@ -133,7 +133,7 @@ const endMsRef = computed<number | null>(() =>
 );
 const windowMinutesEffective = computed<number>(() => (isCustomRange.value ? 0 : windowMinutes.value));
 const page = ref(1);
-const pageSize = ref(100);
+const pageSize = ref(30);
 // The query always pulls every category; the legend filters the stream
 // client-side (mirrors the Logs legend) so the chips can show full
 // per-category counts regardless of which one is selected.
@@ -202,7 +202,7 @@ watch(serviceName, () => {
   selectedVersionId.value = '';
   clearPage();
 });
-watch([serviceName, windowMinutes, customStart, customEnd, selectedVersionId, selectedPageId], () => {
+watch([serviceName, windowMinutes, customStart, customEnd, selectedVersionId, selectedPageId, pageSize], () => {
   page.value = 1;
 });
 // Collapse the open row + its resolution whenever a fresh result set
@@ -488,6 +488,15 @@ function loc(row: BrowserErrorRow): string {
             <option :value="CUSTOM_RANGE_SENTINEL">{{ t('Custom…') }}</option>
           </select>
         </label>
+        <label class="cf">
+          <span>{{ t('Page size') }}</span>
+          <select v-model.number="pageSize" class="cf-input">
+            <option :value="20">20</option>
+            <option :value="30">30</option>
+            <option :value="50">50</option>
+            <option :value="100">100</option>
+          </select>
+        </label>
       </div>
       <SourceMapManager
         v-if="showMaps"
diff --git a/apps/ui/src/layer/logs/LayerLogsView.vue b/apps/ui/src/layer/logs/LayerLogsView.vue
index a254a5c..38bc826 100644
--- a/apps/ui/src/layer/logs/LayerLogsView.vue
+++ b/apps/ui/src/layer/logs/LayerLogsView.vue
@@ -750,6 +750,7 @@ function jumpToTrace(traceId: string, ts?: number): void {
           <span>Page size</span>
           <select v-model.number="pageSize" class="cf-input">
             <option :value="20">20</option>
+            <option :value="30">30</option>
             <option :value="50">50</option>
             <option :value="100">100</option>
           </select>
diff --git a/apps/ui/src/layer/traces/LayerTracesView.vue b/apps/ui/src/layer/traces/LayerTracesView.vue
index 89a4571..9865aa9 100644
--- a/apps/ui/src/layer/traces/LayerTracesView.vue
+++ b/apps/ui/src/layer/traces/LayerTracesView.vue
@@ -1020,11 +1020,10 @@ onBeforeUnmount(() => window.removeEventListener('keydown', onPageKeyDown, true)
           <label class="cf" :title="t('Cap on trace rows returned (default 30).')">
             <span>{{ t('Limit') }}</span>
             <select v-model.number="limit" class="cf-input">
-              <option :value="10">10</option>
+              <option :value="20">20</option>
               <option :value="30">30</option>
               <option :value="50">50</option>
               <option :value="100">100</option>
-              <option :value="200">200</option>
             </select>
           </label>
           <label class="cf" :class="{ 'cf-wide': isCustomRange }">
diff --git a/apps/ui/src/layer/traces/LayerZipkinTracesView.vue b/apps/ui/src/layer/traces/LayerZipkinTracesView.vue
index c064b27..eb4792a 100644
--- a/apps/ui/src/layer/traces/LayerZipkinTracesView.vue
+++ b/apps/ui/src/layer/traces/LayerZipkinTracesView.vue
@@ -569,11 +569,10 @@ function openByInput(): void {
         <label class="cf">
           <span>{{ t('Limit') }}</span>
           <select v-model.number="limit" class="cf-input">
-            <option :value="10">10</option>
+            <option :value="20">20</option>
             <option :value="30">30</option>
             <option :value="50">50</option>
             <option :value="100">100</option>
-            <option :value="200">200</option>
           </select>
         </label>
         <!-- Time range pinned to its own final row so the (optional)
diff --git a/docs/operate/infra-3d-map.md b/docs/operate/infra-3d-map.md
index 92f0201..c43cb34 100644
--- a/docs/operate/infra-3d-map.md
+++ b/docs/operate/infra-3d-map.md
@@ -187,12 +187,12 @@ to publish. Import never writes OAP directly, and a file that isn't a valid
 
 ### Tuning the metric fan-out
 
-The Metrics step loads each layer's traffic numbers in batches, several at once. How aggressively it does this is governed by a small `pipeline` block in the map configuration. These fields are **not** surfaced in the structured editor — they are tuned only by editing the exported configuration JSON and importing it back (or by hand-editing the bundled default before deploying):
+The map's loading stages run in batches, several requests at once. How aggressively they do this is governed by the `performance.bulk.infra3d` block in [`horizon.yaml`](../setup/horizon-yaml.md#performance-tuning) — an operator setting, not part of the map configuration, so it is **not** in the structured editor and does **not** travel with an exported / imported map. Edit `horizon.yaml`; the change is hot-reloaded and takes effect the next time the map is opened:
 
 - `metricConcurrency` — how many metric batches load at the same time. Default `4`, range `1`–`8`. Raise it to fill the cubes faster on a large deployment when OAP has headroom; lower it (toward `1`) if a busy OAP rejects or slows the burst of metric requests during the Metrics step.
-- `metricChunkSize` — how many services share one metric request. Range `1`–`12`. Larger chunks mean fewer requests, but OAP rejects an oversized request, so this is capped — leave it at the default unless you have a reason to change it.
-- `topologyConcurrency` — how many layer call-graphs load at once during the Topologies step. Range `1`–`16`.
-- `templateConcurrency` — how many layer templates load at once during the Templates step. Range `1`–`32`.
+- `metricBulkSize` — how many services share one metric request. Default `6`, range `1`–`12`. Larger means fewer requests, but OAP rejects an oversized request, so this is capped — leave it at the default unless you have a reason to change it.
+- `topologyConcurrency` — how many layer call-graphs load at once during the Topologies step. Default `4`, range `1`–`16`.
+- `templateConcurrency` — how many layer templates load at once during the Templates step. Default `8`, range `1`–`32`.
 
 The defaults are tuned for a typical deployment; only revisit these if the loading timeline stalls on the Metrics, Topologies, or Templates step, or if OAP returns errors under the load.
 
diff --git a/docs/setup/container-image.md b/docs/setup/container-image.md
index a2f4921..7053553 100644
--- a/docs/setup/container-image.md
+++ b/docs/setup/container-image.md
@@ -54,6 +54,21 @@ The four `HORIZON_*_FILE` env vars seed the **defaults** the config schema uses
 
 `server.host` and `server.port` come from the YAML when present. If they are omitted, the image supplies defaults via `HORIZON_SERVER_HOST=0.0.0.0` and `HORIZON_SERVER_PORT=8081`. The image sets `EXPOSE 8081`; if you change `server.port`, also publish the new port.
 
+## Memory & sizing
+
+The BFF holds its **source-map cache in the Node heap** — uploaded Browser-Logs maps live in process memory, not in OAP — so the container's memory limit and Node's heap limit must be sized together with the source-map budget.
+
+- Set **`NODE_OPTIONS=--max-old-space-size=<MB>`** to match the container memory limit (leave headroom for the rest of the process — a value somewhat below the container limit, e.g. `1536` for a 2 GiB container). `--max-old-space-size` is a **process flag read by V8 before any config loads**, so it is **not** a `horizon.yaml` field — pass it via `NODE_OPTIONS` (env), not in the YAML.
+- Size **`sourceMaps.maxTotalBytes`** to fit comfortably inside that heap. A few recently-resolved maps are also kept *parsed* (larger than the raw file), so budget roughly 2× headroom above `maxTotalBytes`. Mounted (static) maps are disk-backed and don't count against the heap. See [Browser Logs & Source Maps](../operate/browser-source-maps.md).
+
+```sh
+docker run -d --name horizon \
+  -p 8081:8081 \
+  -e NODE_OPTIONS=--max-old-space-size=1536 \
+  -v "$PWD/horizon.yaml:/app/horizon.yaml:ro" \
+  ghcr.io/apache/skywalking-horizon-ui:0.7.0
+```
+
 ## How to load `horizon.yaml` into the container
 
 Three common approaches.
diff --git a/docs/setup/horizon-yaml.md b/docs/setup/horizon-yaml.md
index 99a208c..7d83769 100644
--- a/docs/setup/horizon-yaml.md
+++ b/docs/setup/horizon-yaml.md
@@ -16,6 +16,7 @@ This page is the top-level map. Each subsection has its own detail page:
 | `debugLog` | Wire-level request/response log for troubleshooting. | [debugLog](debug-log.md) |
 | `query` | Per-request query limits (the layer-landing service cap). | [below](#query-limits) |
 | `sourceMaps` | In-memory source-map budgets + static mount for the Browser Logs tab. | [Browser Logs & Source Maps](../operate/browser-source-maps.md) |
+| `performance` | How hard the BFF fans queries out to OAP, plus render / per-request record caps. | [below](#performance-tuning) |
 | `layers` | Layers to hide from the sidebar. | [below](#excluded-layers) |
 
 ## Top-level shape
@@ -48,6 +49,18 @@ setup:   { file? }
 alarms:  { file? }
 debugLog: { enabled?, file?, maxBodyChars?, redactAuthHeaders? }
 sourceMaps: { enabled?, maxFileBytes?, maxTotalBytes?, maxFileCount?, bootMountDir? }
+
+performance:
+  bulk:
+    topology:  { nodeBulkSize?, edgeBulkSize?, concurrency? }
+    infra3d:   { metricBulkSize?, metricConcurrency?, topologyConcurrency?, templateConcurrency? }
+    landing:   { bulkSize?, concurrency? }
+    dashboard: { bulkSize? }
+  limits:
+    topologyMaxNodes?: number
+    topologyMaxEdges?: number
+    maxPageSize: { traces?, logs?, browserLogs? }
+
 layers:  { excluded?: [{ key, reason? }] }
 ```
 
@@ -135,6 +148,53 @@ cap and pair it with a tighter OAP rate limit.
 
 Hot-reloadable — a change takes effect on the next landing request.
 
+## Performance tuning
+
+```yaml
+performance:
+  bulk:
+    topology:  { nodeBulkSize: 150, edgeBulkSize: 200, concurrency: 4 }
+    infra3d:   { metricBulkSize: 6, metricConcurrency: 4, topologyConcurrency: 4, templateConcurrency: 8 }
+    landing:   { bulkSize: 6, concurrency: 8 }
+    dashboard: { bulkSize: 6 }
+  limits:
+    topologyMaxNodes: 5000
+    topologyMaxEdges: 15000
+    maxPageSize: { traces: 100, logs: 100, browserLogs: 100 }
+```
+
+The `performance` block tunes how hard Horizon drives your OAP and storage backend. **Every default equals the built-in value, so the whole block is optional** — omit it and Horizon behaves exactly as it does without it. Every value is also **clamped to a hard ceiling**: a number above the ceiling is pulled back down to it (config can only lower the load below a built-in limit, never raise it past one). Hot-reloadable — a change takes effect on the next request of that kind.
+
+The rule of thumb: **raise these on a beefy OAP with a fast storage backend** that can absorb more parallel queries (you'll fill pages and maps faster); **lower them on a modest deployment** where a busy OAP rejects or slows under the burst.
+
+### `performance.bulk` — query fan-out
+
+These govern how Horizon batches and parallelizes its metric queries to OAP. Each family has a **bulk size** (how many metric expressions ride in one OAP request — fewer, larger requests vs. more, smaller ones) and most have a **concurrency** (how many of those requests are in flight at once).
+
+| Section | Tunes | Defaults |
+|---|---|---|
+| `bulk.topology` | The service-map family (topology, instance topology, deployment, endpoint dependency) node/edge metric fan-out. | `nodeBulkSize: 150`, `edgeBulkSize: 200`, `concurrency: 4` |
+| `bulk.infra3d` | The 3D Infrastructure Map's metric, topology, and template loading. | `metricBulkSize: 6`, `metricConcurrency: 4`, `topologyConcurrency: 4`, `templateConcurrency: 8` |
+| `bulk.landing` | The per-layer landing's service-column metric batches. | `bulkSize: 6`, `concurrency: 8` |
+| `bulk.dashboard` | A dashboard's widget metric fan-out. | `bulkSize: 6` |
+
+- **Raise `concurrency` / `*Concurrency`** to load a large topology, 3D map, landing, or dashboard faster when OAP has headroom. **Lower it** (toward `1`) if OAP rejects or slows under the burst of parallel requests.
+- **Bulk sizes** trade request count against request size: a larger bulk means fewer, fatter OAP requests. OAP rejects an oversized request, so each bulk size is capped — leave it at the default unless you have a specific reason to change it.
+- For the 3D map specifically, these knobs are also described in context on the [3D Infrastructure Map](../operate/infra-3d-map.md) page.
+
+### `performance.limits` — render & record caps
+
+| Field | Caps | Default |
+|---|---|---|
+| `topologyMaxNodes` | The render valve for a service map — a graph with more nodes than this is **rejected with a "narrow the scope" notice** rather than drawn as an unreadable hairball. | `5000` |
+| `topologyMaxEdges` | The same valve on edges. | `15000` |
+| `maxPageSize.traces` | The maximum **records** fetched per Traces request (the storage `LIMIT`, not a page count). The page-size picker on the page maxes at this same value, so a client can't out-ask the dropdown. | `100` |
+| `maxPageSize.logs` | The same per-request record cap for Logs. | `100` |
+| `maxPageSize.browserLogs` | The same per-request record cap for Browser Logs. | `100` |
+
+- **`topologyMaxNodes` / `topologyMaxEdges`** are a readability and safety valve, not a data limit — if your deployment legitimately has a graph this large, raising them lets it render (at the cost of a denser scene and a heavier draw). Lower them if you'd rather force operators to scope down sooner.
+- **`maxPageSize.*`** bound how many rows one Traces / Logs / Browser-Logs request pulls from storage. Some storage backends fail or slow on large list queries — lower these to keep list pages cheap on a constrained backend; raise them (up to the ceiling) if your backend serves big result sets comfortably and operators want more rows per fetch.
+
 ## Excluded layers
 
 ```yaml
diff --git a/horizon.example.yaml b/horizon.example.yaml
index 1c9d8a3..7bdd2ef 100644
--- a/horizon.example.yaml
+++ b/horizon.example.yaml
@@ -148,6 +148,7 @@ rbac:
       - topology:read
       - profile:read
       - overview:read
+      - infra-3d:read
 
     # Viewer + platform monitoring (OAP cluster + module inspector).
     maintainer:
@@ -160,9 +161,10 @@ rbac:
       - profile:read
       - overview:read
       - cluster:read
+      - inspect:read
       - ttl:read
       - config:read
-      - inspect:read
+      - infra-3d:read
 
     # Configures observability: dashboards, alarm rules, DSL/OAL,
     # diagnostics, profiling. Inherits viewer + platform reads so the
@@ -177,9 +179,9 @@ rbac:
       - topology:read
       - profile:read
       - cluster:read
+      - inspect:read
       - ttl:read
       - config:read
-      - inspect:read
       - overview:read
       - overview:write
       - setup:read
@@ -190,6 +192,7 @@ rbac:
       - alarm-setup:write
       - alarm-rule:read
       - alarm-rule:write
+      - infra-3d:read
       - rule:read
       - rule:write
       - rule:write:structural
@@ -247,3 +250,36 @@ sourceMaps:
   # published image sets HORIZON_SOURCEMAPS_DIR=/app/sourcemaps; leave empty
   # to disable the static mount.
   bootMountDir: ${HORIZON_SOURCEMAPS_DIR:}
+
+# ────────────────────────────────────────────────────────────────────
+# Performance / behavior tuning — how hard the BFF fans queries out to
+# OAP, and the caps that protect storage. OPERATIONAL (per-deployment,
+# hot-reloaded, never published to OAP), unlike dashboard content, which
+# lives in templates. The whole block is optional — defaults equal the
+# built-in values, shown here for reference. Every value is clamped to a
+# hard ceiling; config can lower it, never raise it past that.
+#
+# Node heap: the BFF holds the source-map cache (above) in process memory,
+# so size the container memory limit and `NODE_OPTIONS=--max-old-space-size`
+# to your sourceMaps budget. (--max-old-space-size is a process flag, not a
+# config field — V8 reads it before this file loads.)
+performance:
+  bulk:
+    # Service-map family (topology / instance-topology / deployment /
+    # endpoint-dependency): bulkSize = aliased MQE fragments per OAP
+    # request; concurrency = parallel requests.
+    topology:   { nodeBulkSize: 150, edgeBulkSize: 200, concurrency: 4 }
+    # 3D infrastructure-map metric fan-out (was the 3D template `pipeline`).
+    infra3d:    { metricBulkSize: 6, metricConcurrency: 4, topologyConcurrency: 4, templateConcurrency: 8 }
+    # Per-layer landing metric-column batches.
+    landing:    { bulkSize: 6, concurrency: 8 }
+    # Dashboard widget metric fan-out.
+    dashboard:  { bulkSize: 6 }
+  limits:
+    # Service-map render valve — a larger graph is rejected with a
+    # "narrow the scope" notice rather than drawn unreadably.
+    topologyMaxNodes: 5000
+    topologyMaxEdges: 15000
+    # Max RECORDS per request for each event list (the OAP storage LIMIT)
+    # — not a page count; the UI picker maxes at the same value.
+    maxPageSize: { traces: 100, logs: 100, browserLogs: 100 }
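
The trace routes in this diff call `clampPageSize(body.pageSize, 20, maxPageSize)`, but the helper's body is not part of the patch. The following is a minimal sketch of how such a clamp could behave, assuming the second argument is the fallback for a missing or invalid request value and the third is the operator-configured ceiling from `performance.limits.maxPageSize`. The implementation and its handling of edge cases are assumptions for illustration, not the repository's actual code.

```typescript
// Hypothetical sketch: the real clampPageSize is not shown in this diff
// and may treat edge cases differently.
function clampPageSize(requested: unknown, fallback: number, max: number): number {
  // Accept only finite numbers; anything else uses the fallback.
  const n =
    typeof requested === 'number' && Number.isFinite(requested)
      ? Math.floor(requested)
      : fallback;
  // Values below 1 are treated as "not provided".
  const base = n >= 1 ? n : fallback;
  // The configured ceiling always wins over the caller's request.
  return Math.min(base, max);
}

// Assumed ceiling, matching the documented default for maxPageSize.traces.
const maxPageSize = 100;
console.log(clampPageSize(undefined, 20, maxPageSize)); // 20
console.log(clampPageSize(50, 20, maxPageSize)); // 50
console.log(clampPageSize(200, 20, maxPageSize)); // 100
```

Under these assumptions, a direct API caller asking for 200 rows receives at most the configured 100, which is the server-side guarantee the route comment describes.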
