Re: [PR] [doc](security) Add threat model for scans [doris]

via GitHub Wed, 20 May 2026 04:15:32 -0700


github-actions[bot] commented on code in PR #63447:
URL: https://github.com/apache/doris/pull/63447#discussion_r3273437626



##########
threat-model.md:
##########
@@ -0,0 +1,806 @@
+# Apache Doris — Threat Model
+
+> **Status: v1.0 — accepted (technical content). Pending wave-4 process
+> items.** Wave-1/2/3/4 maintainer interviews completed 2026-05-14
+> (Doris committer morningman). All technical `(inferred)` tags from
+> v0.1 have been resolved or consciously deferred.
+
+This document is the **security contract** for Apache Doris: what the
+project assumes, what it guarantees given those assumptions, what it
+explicitly leaves to the operator, and how a vulnerability triager
+should classify any inbound report.
+
+---
+
+## 4.1 Header
+
+- **Project**: Apache Doris (https://doris.apache.org)
+- **Model version binding**: written against `master` at commit
+  `1d1846591f7`, 2026-05-14. Per M15 (single-living-doc policy), the

Review Comment:
   This says triage is keyed by a `model-version` field at the top of the file, 
but the file being added does not contain such a field. Since this PR also 
updates `AGENTS.md` to make `threat-model.md` the first document used for 
vulnerability triage, leaving the version identifier as a follow-up makes the 
version-binding rule impossible to apply or automate for this initial accepted 
v1.0 document. Please add the top-level `model-version` field now, or weaken 
this paragraph until the field exists.
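
One way to make the version-binding rule automatable is a small check that fails when the field is missing. The sketch below is illustrative only: the PR does not define the field's syntax, so the `model-version: <number>` form, the optional blockquote prefix, and the 20-line search window are all assumptions, and `find_model_version` is a hypothetical helper name.

```python
import re

# Assumed field syntax (not defined by the PR): a standalone line such as
# "model-version: 1.0", optionally inside a Markdown blockquote ("> ...").
MODEL_VERSION_RE = re.compile(
    r"^\s*(?:>\s*)?model-version\s*:\s*(\d+(?:\.\d+)*)\s*$"
)


def find_model_version(text, max_lines=20):
    """Return the version found in the first `max_lines` lines, or None."""
    for line in text.splitlines()[:max_lines]:
        match = MODEL_VERSION_RE.match(line)
        if match:
            return match.group(1)
    return None
```

A CI step could call `find_model_version` on `threat-model.md` at each release tag and reject the tag when it returns `None`. Prose that merely mentions the field name inside a sentence does not match, because the pattern requires the field to occupy the whole line.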



##########
threat-model.md:
##########
@@ -0,0 +1,806 @@
+# Apache Doris — Threat Model
+
+> **Status: v1.0 — accepted (technical content). Pending wave-4 process
+> items.** Wave-1/2/3/4 maintainer interviews completed 2026-05-14
+> (Doris committer morningman). All technical `(inferred)` tags from
+> v0.1 have been resolved or consciously deferred.
+
+This document is the **security contract** for Apache Doris: what the
+project assumes, what it guarantees given those assumptions, what it
+explicitly leaves to the operator, and how a vulnerability triager
+should classify any inbound report.
+
+---
+
+## 4.1 Header
+
+- **Project**: Apache Doris (https://doris.apache.org)
+- **Model version binding**: written against `master` at commit
+  `1d1846591f7`, 2026-05-14. Per M15 (single-living-doc policy), the
+  `model-version` field at the top of this file is bumped per minor
+  release; vulnerability reports against project version *N* are
+  triaged against the model as it stood at *N* (read the file at the
+  matching git tag).
+- **Reporting cross-reference**: per M1, security findings should be
+  reported to **`[email protected]`** (ASF security team will route
+  to Doris). A short `SECURITY.md` at the repo root will link to this
+  document as canonical scope (M16 (A)). Findings that fall under
+  §4.3 / §4.9 / §4.11a will be closed with a citation to this
+  document.
+- **Status**: v1.0 — technical model accepted. The four wave-4 (M15–M18)
+  meta/process answers are recorded below; physical artifacts
+  (`SECURITY.md`, model-version field policy text) are follow-up work.
+- **Provenance legend**:
+  - *(documented)* — stated in Doris' own README, code comments,
+    `conf/*.conf`, or user docs
+  - *(maintainer, Qn)* / *(maintainer, Mn)* — answered by a Doris
+    committer in interviews on 2026-05-14. `Q1`–`Q8` are wave-1/2
+    questions; `M1`–`M18` are wave-3/4 questions
+  - *(inferred)* — producer's working hypothesis, not yet ratified.
+    **None remain in v1.0.**
+- **Draft confidence**: *2 documented / 88 maintainer / 0 inferred*.
+  Up from v0.1 (2 / 45 / 37); all 14 wave-3 and 4 wave-4 questions
+  resolved on 2026-05-14.
+
+**One-paragraph project description.** Apache Doris is an MPP
+analytical (OLAP) database. Clients submit SQL over the MySQL wire
+protocol or Arrow Flight; queries are parsed and planned by the **FE**
+(Frontend, Java) and executed by the **BE** (Backend, C++) against
+locally managed columnar storage and/or external lakehouse catalogs
+(Hive, Iceberg, Hudi, Paimon, JDBC, S3/HDFS/Azure). Doris ships in two
+deployment shapes: classic on-prem (FE+BE+Broker), and the
+cloud-native **`cloud/`** variant (storage-compute disaggregated,
+shared Meta Service, K8s-native, multi-tenant). Both are in-model;
+content that differs is marked `[on-prem]` / `[cloud]`.
+
+---
+
+## 4.2 Scope and intended use
+
+**Primary intended use** — In-cluster MPP execution of analytical SQL,
+where the cluster is operated by the same organization that controls
+its network perimeter *(maintainer, Q2)*.
+
+**Deployment shapes in scope** *(maintainer, Q2)*:
+- **(A) On-prem / single-tenant** — FE+BE+Broker processes inside a
+  corporate network or private VPC. Cluster-internal network is
+  *implicitly trusted* by operator-provided isolation. **Default
+  shape.**
+- **(B) Cloud variant** — `cloud/` directory; storage-compute
+  disaggregated, K8s-native. **Tenancy model**: Meta Service is
+  shared across tenants; **per-tenant isolation enforced inside
+  Meta Service is a security claim of the project** *(maintainer,
+  M2)*. Cross-tenant data leak / privilege escalation through Meta
+  Service is `VALID`, not `OUT-OF-MODEL`.
+
+**Deployment shape explicitly OUT of scope** *(maintainer, Q2)*:
+- Direct internet exposure of any Doris-listened port. Collapses §4.4
+  trust model.
+
+**Caller roles**:
+
+| Role | Trust level | In §4.7? |
+|---|---|---|
+| Anonymous network attacker on client-facing ports (MySQL 9030, HTTP 8030, FE Arrow Flight 8070, **BE Arrow Flight 8050**) | Untrusted | **Yes — primary pre-auth adversary** |
+| Authenticated SQL user with limited RBAC privileges | Untrusted within RBAC scope | **Yes — primary post-auth adversary** |
+| Authenticated user holding `CREATE CATALOG` (sub-admin) | Untrusted within RBAC; can attach external URL endpoints | **Yes** *(maintainer, M13)* — narrow SSRF actor; see §4.9 |
+| Authenticated user in tenant T₁ trying to reach tenant T₂ data `[cloud]` | Untrusted across tenant boundary | **Yes** *(maintainer, M2)* — cross-tenant adversary |
+| `SUPER` / `ADMIN_PRIV` / database owner / operator-level user | Trusted *(maintainer, M3)* | No |
+| Cluster-internal RPC peer (FE↔BE, BE↔BE, FE↔Follower, FE↔Broker, FE↔MetaService) | Trusted by network isolation *(maintainer, Q1)* | No |
+| External catalog / storage system (Hive Metastore, Iceberg, JDBC source, S3, HDFS, Azure Blob) | Trusted by admin connection *(maintainer, Q8)* | No |
+
+**Component-family table.** Distinct threat profiles. `Surface` lists
+ports / inputs each family exposes; `In model?` ties to §4.3.
+
+| # | Family | Path | Surface | In model? |
+|---|---|---|---|---|
+| 1 | **FE core** (Java) | `fe/fe-core/` | MySQL 9030, HTTP 8030, FE Arrow Flight 8070 (client); RPC 9020, Edit-log 9010 (internal) | **Yes** |
+| 2 | **BE core** (C++) | `be/src/` | **BE Arrow Flight 8050 (client-facing, M7)**; BRPC 8060, Webserver 8040, Heartbeat 9050, BE↔BE 9060 (internal) | **Yes** |
+| 3 | **Cloud variant** | `cloud/src/` | Meta Service (shared, multi-tenant), Recycler, Resource Manager | **Yes** *(maintainer, Q2, M2)* |
+| 4 | **FE auth providers** | `fe/fe-authentication/` | Pluggable: native, LDAP | **Yes** |
+| 5 | **FE connectors** (catalogs) | `fe/fe-connector/{iceberg,hudi,hms,jdbc,paimon,trino,maxcompute,es}` | Outbound to external systems; in-process JAR loading | **Yes** (memory safety only; data trusted per §4.6) |
+| 6 | **BE Java extensions** | `fe/be-java-extensions/` | In-process JVM in BE | **Yes** (memory safety; UDF code trusted per §4.6) |
+| 7 | **HDFS / FS broker** | `fs_brokers/apache_hdfs_broker/` | Thrift RPC (cluster-internal) | **Yes** (internal trust per §4.4) |
+| 8 | Web UI | `ui/` + `webroot/` | Served via FE 8030 (auth gated) | **Yes** |
+| 9 | Vendored MySQL source | `mysql/mysql-{9.4.0,9.5.0}/` | None (reference only, not built or shipped) | **No** *(maintainer, Q4)* |
+| 10 | Sample / dev / CI | `samples/`, `docker/`, `pytest/`, `regression-test/`, `jdbc-version-test/`, `task_executor_simulator/`, `hooks/`, `build-support/` | None at runtime | **No** *(maintainer, Q4)* |
+| 11 | All FE plugins | `fe_plugins/` (`auditdemo`, `auditloader`, `sparksql-converter`, `trino-converter`) | FE plugin SPI | **No** *(maintainer, Q4, M4)* — `auditloader` included; users opting in take ownership |
+| 12 | Client SDKs / extensions / CDC client | `sdk/`, `extension/`, `cdc_client` | Client-side libraries | **No** *(maintainer, Q4)* — separately versioned |
+
+---
+
+## 4.3 Out of scope (explicit non-goals)
+
+**Use cases not supported.**
+
+1. **Direct internet exposure of any port** *(maintainer, Q2)*.
+   Operators must place Doris behind a network perimeter (VPC,
+   firewall, K8s `NetworkPolicy`, equivalent).
+2. **Cluster-internal-network adversary** *(maintainer, Q1)*. The
+   trust boundary sits at the client-facing ports (§4.4). An attacker
+   reaching BE BRPC 8060, BE Webserver 8040, FE Edit-log 9010, FE RPC
+   9020, BE Heartbeat 9050, BE↔BE 9060, or the FS broker is presumed
+   to have already compromised the operator's network.
+3. **DoS via pathological SQL or query plans** *(maintainer, Q5)*.
+   A single authenticated SQL user submitting an unbounded-resource
+   query is **not** a Doris bug. Operators must constrain users via
+   the canonical knob set *(maintainer, M5)*: **`exec_mem_limit`**
+   (per-query memory cap), **Workload Group** (`CREATE WORKLOAD
+   GROUP ...` — recommended production posture for memory/CPU/
+   concurrency caps per user/group), and **`max_connections` /
+   `max_connection_per_user`** (FE config, prevent connection
+   exhaustion).
+4. **Side-channel / timing-based information disclosure**
+   *(maintainer, Q5)*.
+5. **Adversary-controlled external catalog data** *(maintainer, Q8)*.
+   Bytes returned from admin-connected Iceberg/Hive/Hudi/Paimon/
+   JDBC/S3 catalogs are trusted. Crashes from crafted Parquet/ORC/
+   Avro/JSON files are `OUT-OF-MODEL: trusted-input`.
+6. **`SUPER`-privileged adversary** *(maintainer, M3)*. `SUPER` (and
+   `ADMIN_PRIV` / equivalent) holders are trusted by definition.
+   RCE achievable only after acquiring `SUPER` — UDF install, JDBC
+   driver attach, FE plugin registration, `ADMIN SET CONFIG` —
+   `OUT-OF-MODEL: adversary-not-in-scope`.
+7. **Compromise of an external system Doris connects to**.
+   Downstream effects on Doris are out of model per (5).
+8. **Transport-layer confidentiality on default config**
+   *(maintainer, Q7)*. TLS off by default IS the supported production
+   posture; "credentials sniffable in default config" is `BY-DESIGN:
+   property-disclaimed` (§4.9).
+9. **Default ship of pre-auth login lockout** *(maintainer, M11)*.
+   Doris ships `numFailedLogin = 0` and `passwordLockSeconds = 0` —
+   the *mechanism* exists but is opt-in per user via `CREATE USER ...
+   FAILED_LOGIN_ATTEMPTS N PASSWORD_LOCK_TIME T`. "I brute-forced an
+   account in default config" is `BY-DESIGN: property-disclaimed`
+   plus an §4.10 obligation.
+10. **Byzantine cluster peers** *(maintainer, M6)*. BDB-JE FE
+    replication and tablet replication assume honest peers.
+11. **Co-tenant escape at the K8s / OS level** *(maintainer, M2 by
+    extension)*. M2 commits to per-tenant isolation enforced *inside
+    the Meta Service*; OS / K8s / hypervisor isolation between
+    co-tenant pods is the cloud operator's job, not Doris'.
+
+**Code shipped but not threat-modeled.** Family rows 9–12 (vendored
+MySQL, samples/dev/CI, all FE plugins including `auditloader`,
+SDK/extension/CDC) per §4.2. Reports landing there are `OUT-OF-MODEL:
+unsupported-component`.
+
+---
+
+## 4.4 Trust boundaries and data flow
+
+**Three concentric trust zones**, with a **tenant boundary inside
+Meta Service for cloud** *(maintainer, Q1, Q8, M2)*:
+
+```
+┌──────────────────────────────────────────────────────────────────────┐
+│  Zone-3  EXTERNAL  (Hive, Iceberg, Hudi, Paimon, JDBC, S3,           │
+│                    HDFS, Azure)                                       │
+│   ── trusted-by-admin-connection ──                                  │
+│   ┌──────────────────────────────────────────────────────────────┐  │
+│   │ Zone-2  CLUSTER-INTERNAL  (FE↔FE, FE↔BE, BE↔BE, FE↔Broker,   │  │
+│   │                            FE↔MetaService [cloud])            │  │
+│   │     ── trusted-by-network-isolation ──                        │  │
+│   │     ┌─────────────────────────────────────────────────┐      │  │
+│   │     │ Zone-1  CLUSTER-CORE PROCESS                     │      │  │
+│   │     │   FE JVM, BE C++ process,                        │      │  │
+│   │     │   Cloud Meta Service ←── enforces tenant        │      │  │
+│   │     │                          boundary T1│T2│T3      │      │  │
+│   │     └─────────────────────────────────────────────────┘      │  │
+│   └──────────────────────────────────────────────────────────────┘  │
+│                                                                      │
+│      ▲ THE BOUNDARY ▲                                               │
+│                                                                      │
+│  Zone-0  CLIENT  (untrusted MySQL/HTTP/Arrow-Flight clients)         │
+│          ── BE Arrow Flight 8050 lives here too (M7) ──              │
+└──────────────────────────────────────────────────────────────────────┘
+```
+
+**The single load-bearing trust transition is Zone-0 → Zone-1 at
+the client-facing ports.** All other transitions assume the source
+is trusted. **The cloud Meta Service additionally claims a
+*tenant boundary* inside Zone-1** — see §4.8 (NEW property).
+
+**Per-port reachability precondition.** A finding in code reachable
+*only* from Zone-2 or Zone-3 inputs is OUT-OF-MODEL by §4.3 (2) or
+(5). Triage applies this test before anything else.
+
+| Port | Protocol | Zone | Reachability precondition |
+|---|---|---|---|
+| FE 9030 | MySQL wire | 0→1 | bytes attacker-controlled before / during auth, or SQL post-auth |
+| FE 8030 | HTTP / REST | 0→1 | request bytes attacker-controlled |
+| FE 8070 | Arrow Flight (FE) | 0→1 | handshake / auth bytes attacker-controlled |
+| **BE 8050** | **Arrow Flight (BE, client-facing)** | **0→1** *(maintainer, M7)* | **handshake bytes / result-stream consumption attacker-controlled** |
+| FE 9020 | Thrift RPC (FE↔BE) | 2 | **none — out of model** |
+| FE 9010 | BDB JE edit log (FE↔Follower) | 2 | **none — out of model** |
+| BE 8060 | BRPC | 2 | **none — out of model** |
+| BE 8040 | HTTP webserver / metrics | 2 | **none — out of model** |
+| BE 9050 | Thrift heartbeat (FE→BE) | 2 | **none — out of model** |
+| BE 9060 | Thrift fragment exec (BE↔BE) | 2 | **none — out of model** |
+| Broker | Thrift | 2 | **none — out of model** |
+| Cloud Meta Service ↔ FE/BE | gRPC | 2 (network) + tenant-boundary (data) | a finding inside Meta Service is in-model if it crosses the per-tenant boundary |
+
+**Data flow from Zone-3 (external catalogs) into Zone-1.** Per §4.3
+(5) and §4.6, all bytes from external systems are admin-trusted.
+They flow into BE format readers (Parquet, ORC, Avro, JSON, CSV) and
+FE catalog metadata (HMS tables, Iceberg manifests). Crafted-byte
+crashes are OUT-OF-MODEL.
+
+---
+
+## 4.5 Assumptions about the environment
+
+**Supported toolchain / platform** *(maintainer, M8)*:
+- **OS**: Linux x86_64 and Linux aarch64 (both first-class).
+- **FE runtime**: JDK 17.
+- **BE toolchain**: GCC 11+ libstdc++ (or equivalent conformant C++
+  toolchain matching the official docker build image).
+- Anything else (different JDK, non-conformant C++ toolchain) is
+  `OUT-OF-MODEL: non-default-build`.
+
+Operational assumptions:
+- **Allocator**: BE defaults to jemalloc.
+- **Concurrency**: BE assumes a conformant C++ memory model with
+  atomic intrinsics; FE assumes the JMM. Neither is signal-safe.
+- **Filesystem**: BE assumes ownership of `storage_root_path`
+  directories; concurrent external mutation is undefined.
+- **Network**: cluster-internal network is treated as a security

Review Comment:
   The runtime environment-variable inventory is too narrow for the current 
code. For example, BE reads additional hard-coded variables such as 
`DORIS_CLASSPATH` in `be/src/util/jni-util.cpp`, `SKIP_CHECK_ULIMIT` in 
`be/src/storage/storage_engine.cpp`, and `AI_TEST_RESULT` in 
`be/src/exprs/function/ai/ai_functions.h`; FE deployment code also reads 
deployment-specific env names in 
`fe/fe-core/src/main/java/org/apache/doris/deploy/DeployManager.java`. More 
importantly, both FE `ConfigBase.replacedByEnv()` and BE `config::replaceenv()` 
expand arbitrary `${...}` names from config files via `System.getenv()` / 
`std::getenv()`. Because this statement is part of the security contract, 
triagers may incorrectly close env-based reports as out-of-model. Please either 
include these paths/variables or rephrase this as a non-exhaustive list and 
explicitly mention config-driven env expansion.
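
To illustrate why config-driven expansion makes a fixed variable list incomplete, here is a minimal sketch of `${...}` substitution. It is not the Doris implementation: the placeholder pattern, the choice to leave unset names untouched, and the helper names `env_names_in_config` and `expand_env` are assumptions made for the example.

```python
import os
import re

# Assumed placeholder syntax: ${NAME}. The real FE/BE parsers may treat
# edge cases (nesting, unset names, escapes) differently.
PLACEHOLDER_RE = re.compile(r"\$\{([^}]+)\}")


def env_names_in_config(config_text):
    """List every environment-variable name referenced via ${...}."""
    return sorted(set(PLACEHOLDER_RE.findall(config_text)))


def expand_env(value, environ=None):
    """Replace each ${NAME} from the environment; keep unknown names as-is."""
    env = os.environ if environ is None else environ
    return PLACEHOLDER_RE.sub(
        lambda m: env.get(m.group(1), m.group(0)), value
    )
```

Because the set of names comes from whatever the operator writes in the config file, any environment variable can become an input to the process. An inventory in the threat model can therefore only be exhaustive for hard-coded reads, not for names reached through expansion.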



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


