This is an automated email from the ASF dual-hosted git repository.
morningman pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/doris-website.git
The following commit(s) were added to refs/heads/master by this push:
new 2e02ca0a96c [fix](doc) clean up table-design, vector index, lakehouse,
and view docs (#3718)
2e02ca0a96c is described below
commit 2e02ca0a96c8d0883f38d7ece4c0d8b666b88df1
Author: boluor <[email protected]>
AuthorDate: Thu May 21 22:36:05 2026 -0700
[fix](doc) clean up table-design, vector index, lakehouse, and view docs
(#3718)
## Summary
A pass of small correctness fixes across `docs/`,
`versioned_docs/version-{2.1,3.x,4.x}/`, and the matching `i18n/zh-CN/`
files.
### `sql-manual/.../view/SHOW-VIEW.md`
- Replace the informal `grammar:` label with a proper `## Syntax`
heading.
- Drop the empty trailing `## Best Practice` section.
### `table-design/data-type.md` — DECIMAL row
The cell had two bugs:
- It closed the `<ul>` with another opening `<ul>` instead of `</ul>`,
so the list never closed in HTML.
- The third bullet read `When 16 < precision <= 38` (overlapping with
the previous bullet). It should be `When 18 < precision <= 38`, matching
the boundaries used in the `version-3.x` / `version-2.1` docs.
Also close each `<li>` explicitly to match the corrected `</ul>`.
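
The corrected boundaries amount to a lookup from precision to storage width. The sketch below is illustrative only: the function name `decimal_storage_bytes` is hypothetical, and the 32-byte bucket for `38 < precision <= 76` is an inference from the `4/8/16/32` size column and `MAX_P=76`, because the diff truncates that bullet.

```python
def decimal_storage_bytes(precision: int) -> int:
    """Return the assumed storage width in bytes for a DECIMAL precision."""
    if not 1 <= precision <= 76:
        raise ValueError("precision must be in [1, 76]")
    # Each pair is (inclusive upper bound on precision, storage bytes).
    # The (76, 32) entry is inferred, not quoted from the diff.
    for upper, size in ((9, 4), (18, 8), (38, 16), (76, 32)):
        if precision <= upper:
            return size
    raise AssertionError("unreachable: precision was range-checked above")
```

With the old `16 < precision <= 38` wording, precisions 17 and 18 matched both the 8-byte and the 16-byte bullet. With the `18 <` boundary, each precision maps to exactly one width.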
### `table-design/index/prefix-index.md`
Frontmatter `keywords` listed `Prefix Index` and `Sort Key` twice each —
dedupe.
### `table-design/index/ngram-bloomfilter-index.md`
The example `SELECT` projects `any(product_title)`, but the result-table
headers shown immediately below are `any_value(product_title)`. Rewrite
the SELECT to `any_value(product_title)` so the displayed result
actually matches the query a reader would copy.
### `table-design/index/inverted-index/custom-analyzer.md`
- Strip a stray `</content></invoke>` block at the bottom of the file
(apparently residue from a previous edit). Only `version-4.x` and `current`
were affected.
- Dedupe `custom analyzer` in the frontmatter `keywords`.
- Rename `min_ngram` / `max_ngram` to `min_gram` / `max_gram` in the
parameter table and the bulleted parameter descriptions. Every working
example in the same file, and the underlying tokenizer parameters Doris
uses, are `min_gram` / `max_gram`.
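
As a rough illustration of what the two bounds control, the sketch below generates character n-grams with lengths between `min_gram` and `max_gram`, using the defaults (1 and 2) listed in the parameter descriptions. The function `char_ngrams` is hypothetical and is not Doris's tokenizer; the real tokenizer's output order and its `token_chars` handling may differ.

```python
def char_ngrams(text: str, min_gram: int = 1, max_gram: int = 2) -> list[str]:
    """Return every substring of text whose length is in [min_gram, max_gram]."""
    if min_gram < 1 or max_gram < min_gram:
        raise ValueError("require 1 <= min_gram <= max_gram")
    grams = []
    for start in range(len(text)):
        for n in range(min_gram, max_gram + 1):
            # Skip windows that would run past the end of the text.
            if start + n <= len(text):
                grams.append(text[start:start + n])
    return grams
```

For example, `char_ngrams("abc")` yields the unigrams and bigrams of `abc`, grouped by start position.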
### `table-design/index/vector-index/hnsw.md`
The prose said _"…takes about 1.2x the memory…"_ while the formula on
the very next line, and the example calculation that arrives at 650 MB,
both use 1.3x. Update the prose to 1.3x.
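
The estimate in the doc is plain arithmetic: dimensions times 4 bytes per float times row count times the overhead factor. A minimal sketch, with a hypothetical function name and assuming 4-byte float vectors as in the doc's formula:

```python
def hnsw_flat_index_bytes(dim: int, rows: int, overhead: float = 1.3) -> float:
    """Estimate HNSW FLAT index memory as dim * 4 bytes * rows * overhead."""
    bytes_per_float = 4
    return dim * bytes_per_float * rows * overhead


estimate = hnsw_flat_index_bytes(128, 1_000_000)
print(f"{estimate / 1e6:.1f} MB, {estimate / 2**20:.1f} MiB")
```

For 128 dimensions and 1M rows this comes to about 666 MB in decimal units (about 635 MiB), which the doc rounds to 650 MB. The old 1.2x factor gives about 614 MB from the same formula, which is further from the quoted figure.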
### `table-design/index/vector-index/ivf.md`
For 128-dim / 1M rows, the formula above the table evaluates to "≈ 500
MB" while the reference-values table directly below says 496 MB. Align
the table cell to 500 MB so the two presentations agree.
### `table-design/index/vector-index/overview.md`
The Prepared Statement example used `FROM l2_distance_approximate` —
that's the scalar function being projected, not a table. The intended
table (used by all surrounding examples in this file) is `sift_1M`.
### `data-operate/import/data-source/bigquery.md`
`https://cloud.google.com/bigquerydocs/exporting-data` is a 404. The
correct path is `https://cloud.google.com/bigquery/docs/exporting-data`.
### `lakehouse/catalogs/jdbc-catalog-overview.md`
The Maven Central example URL was given over plain HTTP
(`http://repo1.maven.org/...`). `repo1.maven.org` enforces HTTPS, so the
example fails as shown. Change the scheme to `https://`.
## Test plan
- [x] Apply each substitution to the equivalent files in `docs/`,
`versioned_docs/`, and `i18n/zh-CN/` where the same content exists.
- [x] Spot-check rendered diffs for each finding.
- [ ] CI build (docusaurus + sidebar checks).
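
The first test-plan item could be scripted roughly as follows. This is a sketch under stated assumptions, not the procedure actually used for this commit: the helper names are hypothetical, only the two URL fixes are listed, and the root directories are taken from the item above.

```python
from pathlib import Path

# (old, new) pairs taken from the URL fixes described in the summary.
SUBSTITUTIONS = [
    ("https://cloud.google.com/bigquerydocs/exporting-data",
     "https://cloud.google.com/bigquery/docs/exporting-data"),
    ("http://repo1.maven.org/", "https://repo1.maven.org/"),
]

# Doc trees named in the test plan; adjust to the local checkout.
ROOTS = ("docs", "versioned_docs", "i18n/zh-CN")


def apply_substitutions(text, substitutions=SUBSTITUTIONS):
    """Apply each literal (old, new) replacement to text, in order."""
    for old, new in substitutions:
        text = text.replace(old, new)
    return text


def rewrite_tree(root):
    """Rewrite every .md file under root in place; return the changed paths."""
    changed = []
    for path in sorted(Path(root).rglob("*.md")):
        original = path.read_text(encoding="utf-8")
        updated = apply_substitutions(original)
        if updated != original:
            path.write_text(updated, encoding="utf-8")
            changed.append(path)
    return changed
```

Calling `rewrite_tree(root)` for each entry in `ROOTS` and reviewing the returned paths would correspond to the spot-check step. The replacements are literal and idempotent, so a second run changes nothing.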
---------
Co-authored-by: Claude Opus 4.7 (1M context) <[email protected]>
---
docs/data-operate/import/data-source/bigquery.md | 2 +-
docs/lakehouse/catalogs/jdbc-catalog-overview.md | 2 +-
.../sql-statements/table-and-view/view/SHOW-VIEW.md | 4 +---
docs/table-design/data-type.md | 2 +-
docs/table-design/index/inverted-index/custom-analyzer.md | 11 ++++-------
docs/table-design/index/ngram-bloomfilter-index.md | 2 +-
docs/table-design/index/prefix-index.md | 2 --
docs/table-design/index/vector-index/hnsw.md | 2 +-
docs/table-design/index/vector-index/ivf.md | 2 +-
docs/table-design/index/vector-index/overview.md | 2 +-
.../current/data-operate/import/data-source/bigquery.md | 2 +-
.../current/lakehouse/catalogs/jdbc-catalog-overview.md | 2 +-
.../current/table-design/data-type.md | 2 +-
.../table-design/index/inverted-index/custom-analyzer.md | 8 ++++----
.../current/table-design/index/ngram-bloomfilter-index.md | 2 +-
.../current/table-design/index/vector-index/hnsw.md | 2 +-
.../current/table-design/index/vector-index/ivf.md | 2 +-
.../current/table-design/index/vector-index/overview.md | 2 +-
.../version-2.1/data-operate/import/data-source/bigquery.md | 2 +-
.../version-2.1/lakehouse/catalogs/jdbc-catalog-overview.md | 2 +-
.../version-2.1/table-design/data-type.md | 2 +-
.../version-2.1/table-design/index/ngram-bloomfilter-index.md | 2 +-
.../version-3.x/data-operate/import/data-source/bigquery.md | 2 +-
.../version-3.x/lakehouse/catalogs/jdbc-catalog-overview.md | 2 +-
.../version-3.x/table-design/data-type.md | 2 +-
.../version-3.x/table-design/index/ngram-bloomfilter-index.md | 2 +-
.../version-4.x/data-operate/import/data-source/bigquery.md | 2 +-
.../version-4.x/lakehouse/catalogs/jdbc-catalog-overview.md | 2 +-
.../version-4.x/table-design/data-type.md | 2 +-
.../table-design/index/inverted-index/custom-analyzer.md | 8 ++++----
.../version-4.x/table-design/index/ngram-bloomfilter-index.md | 2 +-
.../version-4.x/table-design/index/vector-index/hnsw.md | 2 +-
.../version-4.x/table-design/index/vector-index/ivf.md | 2 +-
.../version-4.x/table-design/index/vector-index/overview.md | 2 +-
.../version-2.1/lakehouse/catalogs/jdbc-catalog-overview.md | 2 +-
.../sql-statements/table-and-view/view/SHOW-VIEW.md | 2 +-
.../version-2.1/table-design/index/ngram-bloomfilter-index.md | 2 +-
.../version-3.x/lakehouse/catalogs/jdbc-catalog-overview.md | 2 +-
.../sql-statements/table-and-view/view/SHOW-VIEW.md | 4 +---
.../version-3.x/table-design/index/ngram-bloomfilter-index.md | 2 +-
.../version-4.x/data-operate/import/data-source/bigquery.md | 2 +-
.../version-4.x/lakehouse/catalogs/jdbc-catalog-overview.md | 2 +-
.../sql-statements/table-and-view/view/SHOW-VIEW.md | 4 +---
versioned_docs/version-4.x/table-design/data-type.md | 2 +-
.../table-design/index/inverted-index/custom-analyzer.md | 11 ++++-------
.../version-4.x/table-design/index/ngram-bloomfilter-index.md | 2 +-
versioned_docs/version-4.x/table-design/index/prefix-index.md | 2 --
.../version-4.x/table-design/index/vector-index/hnsw.md | 2 +-
.../version-4.x/table-design/index/vector-index/ivf.md | 2 +-
.../version-4.x/table-design/index/vector-index/overview.md | 2 +-
50 files changed, 60 insertions(+), 76 deletions(-)
diff --git a/docs/data-operate/import/data-source/bigquery.md
b/docs/data-operate/import/data-source/bigquery.md
index 1f1a10b6b0a..d7220d9c24c 100644
--- a/docs/data-operate/import/data-source/bigquery.md
+++ b/docs/data-operate/import/data-source/bigquery.md
@@ -125,7 +125,7 @@ AS (
#### 2.2 Inspect the exported files on GCS
-The command above exports `sales_data` to GCS. Each partition produces one or
more files with incrementing file names. For details, see
[exporting-data](https://cloud.google.com/bigquerydocs/exporting-data#exporting_data_into_one_or_more_files).
+The command above exports `sales_data` to GCS. Each partition produces one or
more files with incrementing file names. For details, see
[exporting-data](https://cloud.google.com/bigquery/docs/exporting-data#exporting_data_into_one_or_more_files).

diff --git a/docs/lakehouse/catalogs/jdbc-catalog-overview.md
b/docs/lakehouse/catalogs/jdbc-catalog-overview.md
index faf147387cc..614408b718d 100644
--- a/docs/lakehouse/catalogs/jdbc-catalog-overview.md
+++ b/docs/lakehouse/catalogs/jdbc-catalog-overview.md
@@ -65,7 +65,7 @@ CREATE CATALOG [IF NOT EXISTS] catalog_name PROPERTIES (
2. Local absolute path. For example,
`file:///path/to/mysql-connector-j-8.3.0.jar`. The Jar file must be pre-placed
in the specified path on all FE/BE nodes.
- 3. HTTP URL. For example:
`http://repo1.maven.org/maven2/com/mysql/mysql-connector-j/8.3.0/mysql-connector-j-8.3.0.jar`.
The system will download the driver file from this HTTP address. Only supports
HTTP services without authentication.
+ 3. HTTP URL. For example:
`https://repo1.maven.org/maven2/com/mysql/mysql-connector-j/8.3.0/mysql-connector-j-8.3.0.jar`.
The system will download the driver file from this HTTP address. Only supports
HTTP services without authentication.
* Optional Properties
diff --git a/docs/sql-manual/sql-statements/table-and-view/view/SHOW-VIEW.md
b/docs/sql-manual/sql-statements/table-and-view/view/SHOW-VIEW.md
index 312fee6ff54..259700e46b3 100644
--- a/docs/sql-manual/sql-statements/table-and-view/view/SHOW-VIEW.md
+++ b/docs/sql-manual/sql-statements/table-and-view/view/SHOW-VIEW.md
@@ -10,7 +10,7 @@
This statement is used to display all views based on the given table
-grammar:
+## Syntax
```sql
SHOW VIEW { FROM | IN } table [ FROM db ]
@@ -28,5 +28,3 @@ grammar:
SHOW, VIEW
-## Best Practice
-
diff --git a/docs/table-design/data-type.md b/docs/table-design/data-type.md
index e634e5876d4..544fa9d81b1 100644
--- a/docs/table-design/data-type.md
+++ b/docs/table-design/data-type.md
@@ -21,7 +21,7 @@ The list of data types supported by Apache Doris is as
follows:
| [LARGEINT](../sql-manual/basic-element/sql-data-types/numeric/LARGEINT)
| 16 | Signed integer, range [-2^127 + 1 ~ 2^127 - 1]. |
| [FLOAT](../sql-manual/basic-element/sql-data-types/numeric/FLOATING-POINT)
| 4 | Floating-point number, range [-3.4*10^38 ~ 3.4*10^38].
|
| [DOUBLE](../sql-manual/basic-element/sql-data-types/numeric/FLOATING-POINT)
| 8 | Floating-point number, range [-1.79*10^308 ~ 1.79*10^308].
|
-| [DECIMAL](../sql-manual/basic-element/sql-data-types/numeric/DECIMAL)
| 4/8/16/32 | High-precision fixed-point number. Format: DECIMAL(P[,S]). P
represents the total number of significant digits (precision), and S represents
the number of digits after the decimal point (scale). The range of P is [1,
MAX_P]. When `enable_decimal256`=false, MAX_P=38; when
`enable_decimal256`=true, MAX_P=76. The range of S is [0, P].<br>The default
value of `enable_decimal256` is false. Setting [...]
+| [DECIMAL](../sql-manual/basic-element/sql-data-types/numeric/DECIMAL)
| 4/8/16/32 | High-precision fixed-point number. Format: DECIMAL(P[,S]). P
represents the total number of significant digits (precision), and S represents
the number of digits after the decimal point (scale). The range of P is [1,
MAX_P]. When `enable_decimal256`=false, MAX_P=38; when
`enable_decimal256`=true, MAX_P=76. The range of S is [0, P].<br>The default
value of `enable_decimal256` is false. Setting [...]
### [Date
Types](../sql-manual/basic-element/sql-data-types/data-type-overview#date-types)
diff --git a/docs/table-design/index/inverted-index/custom-analyzer.md
b/docs/table-design/index/inverted-index/custom-analyzer.md
index 3efbc55b36b..3b901de8cae 100644
--- a/docs/table-design/index/inverted-index/custom-analyzer.md
+++ b/docs/table-design/index/inverted-index/custom-analyzer.md
@@ -4,7 +4,6 @@
"language": "en",
"description": "Doris custom analyzers combine character filters,
tokenizers, and token filters to flexibly control text segmentation strategies,
improving the search relevance and precision of inverted indexes.",
"keywords": [
- "custom analyzer",
"custom analyzer",
"inverted index tokenizer",
"tokenizer",
@@ -91,8 +90,8 @@ Supported tokenizer types:
| Type | Description | Main Parameters |
| --- | --- | --- |
| `standard` | Standard tokenization (follows Unicode text segmentation),
suitable for most languages | None |
-| `ngram` | Splits by N-grams | `min_ngram`, `max_ngram`, `token_chars` |
-| `edge_ngram` | Generates N-grams starting from the beginning of the word |
`min_ngram`, `max_ngram`, `token_chars` |
+| `ngram` | Splits by N-grams | `min_gram`, `max_gram`, `token_chars` |
+| `edge_ngram` | Generates N-grams starting from the beginning of the word |
`min_gram`, `max_gram`, `token_chars` |
| `keyword` | Outputs the entire text as a single term, often combined with
token_filter | None |
| `char_group` | Splits by the given characters | `tokenize_on_chars` |
| `basic` | Simple English / digit / Chinese / Unicode tokenization |
`extra_chars` |
@@ -100,8 +99,8 @@ Supported tokenizer types:
Parameter descriptions:
-- `min_ngram`: minimum length (default 1)
-- `max_ngram`: maximum length (default 2)
+- `min_gram`: minimum length (default 1)
+- `max_gram`: maximum length (default 2)
- `token_chars`: character categories to keep (default: keep all). Options:
`letter`, `digit`, `whitespace`, `punctuation`, `symbol`
- `tokenize_on_chars`: a character list or category. Categories support
`whitespace`, `letter`, `digit`, `punctuation`, `symbol`, `cjk`
- `extra_chars`: additional ASCII characters to split on (such as `[]().`)
@@ -503,5 +502,3 @@ Result:
1. Nesting multiple components in a custom `analyzer` may degrade tokenization
performance.
2. The `select tokenize` tokenization function supports custom analyzers and
can be used to debug tokenization results.
3. Only one of the predefined `built_in_analyzer` and a custom `analyzer` can
exist on the same index.
-</content>
-</invoke>
\ No newline at end of file
diff --git a/docs/table-design/index/ngram-bloomfilter-index.md
b/docs/table-design/index/ngram-bloomfilter-index.md
index c30319a6fc5..844b3bf1001 100644
--- a/docs/table-design/index/ngram-bloomfilter-index.md
+++ b/docs/table-design/index/ngram-bloomfilter-index.md
@@ -239,7 +239,7 @@ mysql> SELECT COUNT() FROM amazon_reviews;
```sql
SELECT
product_id,
- any(product_title),
+ any_value(product_title),
AVG(star_rating) AS rating,
COUNT() AS count
FROM
diff --git a/docs/table-design/index/prefix-index.md
b/docs/table-design/index/prefix-index.md
index e9dcb06a354..9d86959ea02 100644
--- a/docs/table-design/index/prefix-index.md
+++ b/docs/table-design/index/prefix-index.md
@@ -6,8 +6,6 @@
"keywords": [
"Prefix Index",
"Sort Key",
- "Sort Key",
- "Prefix Index",
"Apache Doris",
"sparse index",
"query acceleration",
diff --git a/docs/table-design/index/vector-index/hnsw.md
b/docs/table-design/index/vector-index/hnsw.md
index b1ec0dd0823..8dbad1d57ad 100644
--- a/docs/table-design/index/vector-index/hnsw.md
+++ b/docs/table-design/index/vector-index/hnsw.md
@@ -418,7 +418,7 @@ Before high-concurrency queries, run a cold query first to
warm up the index fil
#### Memory Footprint and Performance
-> **An HNSW index (without quantization compression) takes about 1.2x the
memory of the vectors it indexes.**
+> **An HNSW index (without quantization compression) takes about 1.3x the
memory of the vectors it indexes.**
For example, for a 128-dimensional, 1M dataset, an HNSW FLAT index needs about
`128 x 4 x 1,000,000 x 1.3 ~= 650 MB`.
diff --git a/docs/table-design/index/vector-index/ivf.md
b/docs/table-design/index/vector-index/ivf.md
index 78050cde077..d0ab6383c71 100644
--- a/docs/table-design/index/vector-index/ivf.md
+++ b/docs/table-design/index/vector-index/ivf.md
@@ -395,7 +395,7 @@ Reference values:
| dim | rows | Estimated memory |
| --- | --- | --- |
-| 128 | 1M | 496 MB |
+| 128 | 1M | 500 MB |
| 768 | 1M | 2.9 GB |
To guarantee query performance, the BE must have enough memory to hold the
entire index. Otherwise, frequent IO on index files causes severe query
performance degradation.
diff --git a/docs/table-design/index/vector-index/overview.md
b/docs/table-design/index/vector-index/overview.md
index 7797e7a873d..9fe2be69766 100644
--- a/docs/table-design/index/vector-index/overview.md
+++ b/docs/table-design/index/vector-index/overview.md
@@ -478,7 +478,7 @@ Common embedding-model outputs are typically 768 dimensions
or higher. If you em
```java
// use `?` for placement holders, readStatement should be reused
PreparedStatement readStatement = conn.prepareStatement("SELECT id,
l2_distance_approximate(embedding, cast (? as ARRAY<FLOAT>)) AS distance
- FROM l2_distance_approximate
+ FROM sift_1M
ORDER BY distance
LIMIT 10");
diff --git
a/i18n/zh-CN/docusaurus-plugin-content-docs/current/data-operate/import/data-source/bigquery.md
b/i18n/zh-CN/docusaurus-plugin-content-docs/current/data-operate/import/data-source/bigquery.md
index b908ca53532..8a862fd9b2e 100644
---
a/i18n/zh-CN/docusaurus-plugin-content-docs/current/data-operate/import/data-source/bigquery.md
+++
b/i18n/zh-CN/docusaurus-plugin-content-docs/current/data-operate/import/data-source/bigquery.md
@@ -125,7 +125,7 @@ AS (
#### 2.2 查看 GCS 上的导出文件
-以上命令会将 `sales_data` 的数据导出到 GCS,每个分区会产生一个或多个文件,文件名递增。具体规则可参考
[exporting-data](https://cloud.google.com/bigquerydocs/exporting-data#exporting_data_into_one_or_more_files)。
+以上命令会将 `sales_data` 的数据导出到 GCS,每个分区会产生一个或多个文件,文件名递增。具体规则可参考
[exporting-data](https://cloud.google.com/bigquery/docs/exporting-data#exporting_data_into_one_or_more_files)。

diff --git
a/i18n/zh-CN/docusaurus-plugin-content-docs/current/lakehouse/catalogs/jdbc-catalog-overview.md
b/i18n/zh-CN/docusaurus-plugin-content-docs/current/lakehouse/catalogs/jdbc-catalog-overview.md
index fdf6c7d3cef..6dae931b15f 100644
---
a/i18n/zh-CN/docusaurus-plugin-content-docs/current/lakehouse/catalogs/jdbc-catalog-overview.md
+++
b/i18n/zh-CN/docusaurus-plugin-content-docs/current/lakehouse/catalogs/jdbc-catalog-overview.md
@@ -65,7 +65,7 @@ CREATE CATALOG [IF NOT EXISTS] catalog_name PROPERTIES (
2. 本地绝对路径。如 `file:///path/to/mysql-connector-j-8.3.0.jar`。需将 Jar
包预先存放在所有 FE/BE 节点指定的路径下。
- 3. Http
地址。如:`http://repo1.maven.org/maven2/com/mysql/mysql-connector-j/8.3.0/mysql-connector-j-8.3.0.jar`
系统会从这个 Http 地址下载 Driver 文件。仅支持无认证的 Http 服务。
+ 3. Http
地址。如:`https://repo1.maven.org/maven2/com/mysql/mysql-connector-j/8.3.0/mysql-connector-j-8.3.0.jar`
系统会从这个 Http 地址下载 Driver 文件。仅支持无认证的 Http 服务。
* 可选属性
diff --git
a/i18n/zh-CN/docusaurus-plugin-content-docs/current/table-design/data-type.md
b/i18n/zh-CN/docusaurus-plugin-content-docs/current/table-design/data-type.md
index 714216786d0..acef5f85849 100644
---
a/i18n/zh-CN/docusaurus-plugin-content-docs/current/table-design/data-type.md
+++
b/i18n/zh-CN/docusaurus-plugin-content-docs/current/table-design/data-type.md
@@ -21,7 +21,7 @@ Apache Doris 已支持的数据类型列表如下:
| [LARGEINT](../sql-manual/basic-element/sql-data-types/numeric/LARGEINT)
| 16 | 有符号整数,范围 [-2^127 + 1 ~ 2^127 - 1]。 |
| [FLOAT](../sql-manual/basic-element/sql-data-types/numeric/FLOATING-POINT)
| 4 | 浮点数,范围 [-3.4*10^38 ~ 3.4*10^38]。 |
| [DOUBLE](../sql-manual/basic-element/sql-data-types/numeric/FLOATING-POINT)
| 8 | 浮点数,范围 [-1.79*10^308 ~ 1.79*10^308]。 |
-| [DECIMAL](../sql-manual/basic-element/sql-data-types/numeric/DECIMAL)
| 4/8/16/32 | 高精度定点数,格式:DECIMAL(P[,S])。其中,P 代表一共有多少个有效数字(precision),S
代表小数位有多少数字(scale)。有效数字 P 的范围是 [1, MAX_P],`enable_decimal256`=false
时,MAX_P=38,`enable_decimal256`=true 时,MAX_P=76。小数位数字数量 S 的范围是 [0,
P]。<br>`enable_decimal256` 的默认值是 false,设置为 true
可以获得更加精确的结果,但是会带来一些性能损失。<br>存储空间:<ul><li>0 < precision <= 9 时,占用 4 字节。<li>9 <
precision <= 18 时,占用 8 字节。<li>16 < precision <= 38 时,占用 16 字节。<li>38 <
precision [...]
+| [DECIMAL](../sql-manual/basic-element/sql-data-types/numeric/DECIMAL)
| 4/8/16/32 | 高精度定点数,格式:DECIMAL(P[,S])。其中,P 代表一共有多少个有效数字(precision),S
代表小数位有多少数字(scale)。有效数字 P 的范围是 [1, MAX_P],`enable_decimal256`=false
时,MAX_P=38,`enable_decimal256`=true 时,MAX_P=76。小数位数字数量 S 的范围是 [0,
P]。<br>`enable_decimal256` 的默认值是 false,设置为 true
可以获得更加精确的结果,但是会带来一些性能损失。<br>存储空间:<ul><li>0 < precision <= 9 时,占用 4
字节。</li><li>9 < precision <= 18 时,占用 8 字节。</li><li>18 < precision <= 38 时,占用 16
字节。</li><li> [...]
### [日期类型](../sql-manual/basic-element/sql-data-types/data-type-overview#日期类型)
diff --git
a/i18n/zh-CN/docusaurus-plugin-content-docs/current/table-design/index/inverted-index/custom-analyzer.md
b/i18n/zh-CN/docusaurus-plugin-content-docs/current/table-design/index/inverted-index/custom-analyzer.md
index 3c1cde7f49c..803ec96eed9 100644
---
a/i18n/zh-CN/docusaurus-plugin-content-docs/current/table-design/index/inverted-index/custom-analyzer.md
+++
b/i18n/zh-CN/docusaurus-plugin-content-docs/current/table-design/index/inverted-index/custom-analyzer.md
@@ -91,8 +91,8 @@ PROPERTIES (
| 类型 | 说明 | 主要参数 |
| --- | --- | --- |
| `standard` | 标准分词(遵循 Unicode 文本分割),适用于多数语言 | 无 |
-| `ngram` | 按 N 元组切分 | `min_ngram`、`max_ngram`、`token_chars` |
-| `edge_ngram` | 从词首起始位置生成 N 元组 | `min_ngram`、`max_ngram`、`token_chars` |
+| `ngram` | 按 N 元组切分 | `min_gram`、`max_gram`、`token_chars` |
+| `edge_ngram` | 从词首起始位置生成 N 元组 | `min_gram`、`max_gram`、`token_chars` |
| `keyword` | 整段文本作为一个词项输出,常与 token_filter 组合 | 无 |
| `char_group` | 按给定字符切分 | `tokenize_on_chars` |
| `basic` | 简单英文 / 数字 / 中文 / Unicode 分词 | `extra_chars` |
@@ -100,8 +100,8 @@ PROPERTIES (
参数说明:
-- `min_ngram`:最小长度(默认 1)
-- `max_ngram`:最大长度(默认 2)
+- `min_gram`:最小长度(默认 1)
+- `max_gram`:最大长度(默认 2)
-
`token_chars`:保留字符类别(默认保留全部)。可选:`letter`、`digit`、`whitespace`、`punctuation`、`symbol`
- `tokenize_on_chars`:字符列表或类别,类别支持
`whitespace`、`letter`、`digit`、`punctuation`、`symbol`、`cjk`
- `extra_chars`:额外分割的 ASCII 字符(如 `[]().`)
diff --git
a/i18n/zh-CN/docusaurus-plugin-content-docs/current/table-design/index/ngram-bloomfilter-index.md
b/i18n/zh-CN/docusaurus-plugin-content-docs/current/table-design/index/ngram-bloomfilter-index.md
index 784979c65f5..aefbe1b2d0e 100644
---
a/i18n/zh-CN/docusaurus-plugin-content-docs/current/table-design/index/ngram-bloomfilter-index.md
+++
b/i18n/zh-CN/docusaurus-plugin-content-docs/current/table-design/index/ngram-bloomfilter-index.md
@@ -239,7 +239,7 @@ mysql> SELECT COUNT() FROM amazon_reviews;
```sql
SELECT
product_id,
- any(product_title),
+ any_value(product_title),
AVG(star_rating) AS rating,
COUNT() AS count
FROM
diff --git
a/i18n/zh-CN/docusaurus-plugin-content-docs/current/table-design/index/vector-index/hnsw.md
b/i18n/zh-CN/docusaurus-plugin-content-docs/current/table-design/index/vector-index/hnsw.md
index 05ddb68fef0..bc941bb2504 100644
---
a/i18n/zh-CN/docusaurus-plugin-content-docs/current/table-design/index/vector-index/hnsw.md
+++
b/i18n/zh-CN/docusaurus-plugin-content-docs/current/table-design/index/vector-index/hnsw.md
@@ -418,7 +418,7 @@ Doris 的 ANN 索引基于 Meta 开源的
[faiss](https://github.com/facebookres
#### 内存空间与性能
-> **HNSW 索引(无量化压缩)占用的内存空间约为其检索向量内存大小的 1.2 倍。**
+> **HNSW 索引(无量化压缩)占用的内存空间约为其检索向量内存大小的 1.3 倍。**
例如 128 维、1M 数据集,HNSW FLAT 索引大约需要 `128 × 4 × 1,000,000 × 1.3 ≈ 650 MB`。
diff --git
a/i18n/zh-CN/docusaurus-plugin-content-docs/current/table-design/index/vector-index/ivf.md
b/i18n/zh-CN/docusaurus-plugin-content-docs/current/table-design/index/vector-index/ivf.md
index 37afe8af615..fdccc710528 100644
---
a/i18n/zh-CN/docusaurus-plugin-content-docs/current/table-design/index/vector-index/ivf.md
+++
b/i18n/zh-CN/docusaurus-plugin-content-docs/current/table-design/index/vector-index/ivf.md
@@ -396,7 +396,7 @@ Doris 的 ANN 索引基于 Meta 开源的
[faiss](https://github.com/facebookres
| dim | rows | 预估内存 |
| --- | --- | --- |
-| 128 | 1M | 496 MB |
+| 128 | 1M | 500 MB |
| 768 | 1M | 2.9 GB |
为保证查询性能,BE 必须有足够的内存容纳全部索引;否则索引文件频繁 IO 会导致查询性能大幅衰减。
diff --git
a/i18n/zh-CN/docusaurus-plugin-content-docs/current/table-design/index/vector-index/overview.md
b/i18n/zh-CN/docusaurus-plugin-content-docs/current/table-design/index/vector-index/overview.md
index d3a91b99311..844c92cb69a 100644
---
a/i18n/zh-CN/docusaurus-plugin-content-docs/current/table-design/index/vector-index/overview.md
+++
b/i18n/zh-CN/docusaurus-plugin-content-docs/current/table-design/index/vector-index/overview.md
@@ -478,7 +478,7 @@ PROPERTIES (
```java
// use `?` for placement holders, readStatement should be reused
PreparedStatement readStatement = conn.prepareStatement("SELECT id,
l2_distance_approximate(embedding, cast (? as ARRAY<FLOAT>)) AS distance
- FROM l2_distance_approximate
+ FROM sift_1M
ORDER BY distance
LIMIT 10");
diff --git
a/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.1/data-operate/import/data-source/bigquery.md
b/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.1/data-operate/import/data-source/bigquery.md
index af3b4d95421..e324f9b5b59 100644
---
a/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.1/data-operate/import/data-source/bigquery.md
+++
b/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.1/data-operate/import/data-source/bigquery.md
@@ -99,7 +99,7 @@ PROPERTIES (
2.2. **查看 GCS 上的导出文件**
- 以上命令会将 sales_data 的数据导出到 GCS
上,并且每个分区会产生一个或多个文件,文件名递增,具体可参考[exporting-data](https://cloud.google.com/bigquerydocs/exporting-data#exporting_data_into_one_or_more_files),如下
+ 以上命令会将 sales_data 的数据导出到 GCS
上,并且每个分区会产生一个或多个文件,文件名递增,具体可参考[exporting-data](https://cloud.google.com/bigquery/docs/exporting-data#exporting_data_into_one_or_more_files),如下

## 3. 导入数据到 Doris
diff --git
a/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.1/lakehouse/catalogs/jdbc-catalog-overview.md
b/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.1/lakehouse/catalogs/jdbc-catalog-overview.md
index fdf6c7d3cef..6dae931b15f 100644
---
a/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.1/lakehouse/catalogs/jdbc-catalog-overview.md
+++
b/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.1/lakehouse/catalogs/jdbc-catalog-overview.md
@@ -65,7 +65,7 @@ CREATE CATALOG [IF NOT EXISTS] catalog_name PROPERTIES (
2. 本地绝对路径。如 `file:///path/to/mysql-connector-j-8.3.0.jar`。需将 Jar
包预先存放在所有 FE/BE 节点指定的路径下。
- 3. Http
地址。如:`http://repo1.maven.org/maven2/com/mysql/mysql-connector-j/8.3.0/mysql-connector-j-8.3.0.jar`
系统会从这个 Http 地址下载 Driver 文件。仅支持无认证的 Http 服务。
+ 3. Http
地址。如:`https://repo1.maven.org/maven2/com/mysql/mysql-connector-j/8.3.0/mysql-connector-j-8.3.0.jar`
系统会从这个 Http 地址下载 Driver 文件。仅支持无认证的 Http 服务。
* 可选属性
diff --git
a/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.1/table-design/data-type.md
b/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.1/table-design/data-type.md
index 36a9a14c7f5..a64741a99c3 100644
---
a/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.1/table-design/data-type.md
+++
b/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.1/table-design/data-type.md
@@ -21,7 +21,7 @@ Apache Doris 已支持的数据类型列表如下:
| [LARGEINT](../sql-manual/basic-element/sql-data-types/numeric/LARGEINT)
| 16 | 有符号整数,范围 [-2^127 + 1 ~ 2^127 - 1]。 |
| [FLOAT](../sql-manual/basic-element/sql-data-types/numeric/FLOAT) |
4 | 浮点数,范围 [-3.4*10^38 ~ 3.4*10^38]。 |
| [DOUBLE](../sql-manual/basic-element/sql-data-types/numeric/DOUBLE)
| 8 | 浮点数,范围 [-1.79*10^308 ~ 1.79*10^308]。 |
-| [DECIMAL](../sql-manual/basic-element/sql-data-types/numeric/DECIMAL)
| 4/8/16/32 | 高精度定点数,格式:DECIMAL(P[,S])。其中,P 代表一共有多少个有效数字(precision),S
代表小数位有多少数字(scale)。有效数字 P 的范围是 [1, MAX_P],`enable_decimal256`=false
时,MAX_P=38,`enable_decimal256`=true 时,MAX_P=76。小数位数字数量 S 的范围是 [0,
P]。<br>`enable_decimal256` 的默认值是 false,设置为 true
可以获得更加精确的结果,但是会带来一些性能损失。<br>存储空间:<ul><li>0 < precision <= 9 时,占用 4 字节。<li>9 <
precision <= 18 时,占用 8 字节。<li>16 < precision <= 38 时,占用 16 字节。<li>38 <
precision [...]
+| [DECIMAL](../sql-manual/basic-element/sql-data-types/numeric/DECIMAL)
| 4/8/16/32 | 高精度定点数,格式:DECIMAL(P[,S])。其中,P 代表一共有多少个有效数字(precision),S
代表小数位有多少数字(scale)。有效数字 P 的范围是 [1, MAX_P],`enable_decimal256`=false
时,MAX_P=38,`enable_decimal256`=true 时,MAX_P=76。小数位数字数量 S 的范围是 [0,
P]。<br>`enable_decimal256` 的默认值是 false,设置为 true
可以获得更加精确的结果,但是会带来一些性能损失。<br>存储空间:<ul><li>0 < precision <= 9 时,占用 4
字节。</li><li>9 < precision <= 18 时,占用 8 字节。</li><li>18 < precision <= 38 时,占用 16
字节。</li><li> [...]
### [日期类型](../sql-manual/basic-element/sql-data-types/data-type-overview#日期类型)
diff --git
a/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.1/table-design/index/ngram-bloomfilter-index.md
b/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.1/table-design/index/ngram-bloomfilter-index.md
index 8723ea99291..b87f8d4e7e0 100644
---
a/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.1/table-design/index/ngram-bloomfilter-index.md
+++
b/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.1/table-design/index/ngram-bloomfilter-index.md
@@ -178,7 +178,7 @@ mysql> SELECT COUNT() FROM amazon_reviews;
```
SELECT
product_id,
- any(product_title),
+ any_value(product_title),
AVG(star_rating) AS rating,
COUNT() AS count
FROM
diff --git
a/i18n/zh-CN/docusaurus-plugin-content-docs/version-3.x/data-operate/import/data-source/bigquery.md
b/i18n/zh-CN/docusaurus-plugin-content-docs/version-3.x/data-operate/import/data-source/bigquery.md
index af3b4d95421..e324f9b5b59 100644
---
a/i18n/zh-CN/docusaurus-plugin-content-docs/version-3.x/data-operate/import/data-source/bigquery.md
+++
b/i18n/zh-CN/docusaurus-plugin-content-docs/version-3.x/data-operate/import/data-source/bigquery.md
@@ -99,7 +99,7 @@ PROPERTIES (
2.2. **查看 GCS 上的导出文件**
- 以上命令会将 sales_data 的数据导出到 GCS
上,并且每个分区会产生一个或多个文件,文件名递增,具体可参考[exporting-data](https://cloud.google.com/bigquerydocs/exporting-data#exporting_data_into_one_or_more_files),如下
+ 以上命令会将 sales_data 的数据导出到 GCS
上,并且每个分区会产生一个或多个文件,文件名递增,具体可参考[exporting-data](https://cloud.google.com/bigquery/docs/exporting-data#exporting_data_into_one_or_more_files),如下

## 3. 导入数据到 Doris
diff --git
a/i18n/zh-CN/docusaurus-plugin-content-docs/version-3.x/lakehouse/catalogs/jdbc-catalog-overview.md
b/i18n/zh-CN/docusaurus-plugin-content-docs/version-3.x/lakehouse/catalogs/jdbc-catalog-overview.md
index fdf6c7d3cef..6dae931b15f 100644
---
a/i18n/zh-CN/docusaurus-plugin-content-docs/version-3.x/lakehouse/catalogs/jdbc-catalog-overview.md
+++
b/i18n/zh-CN/docusaurus-plugin-content-docs/version-3.x/lakehouse/catalogs/jdbc-catalog-overview.md
@@ -65,7 +65,7 @@ CREATE CATALOG [IF NOT EXISTS] catalog_name PROPERTIES (
2. 本地绝对路径。如 `file:///path/to/mysql-connector-j-8.3.0.jar`。需将 Jar
包预先存放在所有 FE/BE 节点指定的路径下。
- 3. Http
地址。如:`http://repo1.maven.org/maven2/com/mysql/mysql-connector-j/8.3.0/mysql-connector-j-8.3.0.jar`
系统会从这个 Http 地址下载 Driver 文件。仅支持无认证的 Http 服务。
+ 3. Http
地址。如:`https://repo1.maven.org/maven2/com/mysql/mysql-connector-j/8.3.0/mysql-connector-j-8.3.0.jar`
系统会从这个 Http 地址下载 Driver 文件。仅支持无认证的 Http 服务。
* 可选属性
diff --git
a/i18n/zh-CN/docusaurus-plugin-content-docs/version-3.x/table-design/data-type.md
b/i18n/zh-CN/docusaurus-plugin-content-docs/version-3.x/table-design/data-type.md
index 36a9a14c7f5..a64741a99c3 100644
---
a/i18n/zh-CN/docusaurus-plugin-content-docs/version-3.x/table-design/data-type.md
+++
b/i18n/zh-CN/docusaurus-plugin-content-docs/version-3.x/table-design/data-type.md
@@ -21,7 +21,7 @@ Apache Doris 已支持的数据类型列表如下:
| [LARGEINT](../sql-manual/basic-element/sql-data-types/numeric/LARGEINT)
| 16 | 有符号整数,范围 [-2^127 + 1 ~ 2^127 - 1]。 |
| [FLOAT](../sql-manual/basic-element/sql-data-types/numeric/FLOAT) |
4 | 浮点数,范围 [-3.4*10^38 ~ 3.4*10^38]。 |
| [DOUBLE](../sql-manual/basic-element/sql-data-types/numeric/DOUBLE)
| 8 | 浮点数,范围 [-1.79*10^308 ~ 1.79*10^308]。 |
-| [DECIMAL](../sql-manual/basic-element/sql-data-types/numeric/DECIMAL)
| 4/8/16/32 | 高精度定点数,格式:DECIMAL(P[,S])。其中,P 代表一共有多少个有效数字(precision),S
代表小数位有多少数字(scale)。有效数字 P 的范围是 [1, MAX_P],`enable_decimal256`=false
时,MAX_P=38,`enable_decimal256`=true 时,MAX_P=76。小数位数字数量 S 的范围是 [0,
P]。<br>`enable_decimal256` 的默认值是 false,设置为 true
可以获得更加精确的结果,但是会带来一些性能损失。<br>存储空间:<ul><li>0 < precision <= 9 时,占用 4 字节。<li>9 <
precision <= 18 时,占用 8 字节。<li>16 < precision <= 38 时,占用 16 字节。<li>38 <
precision [...]
+| [DECIMAL](../sql-manual/basic-element/sql-data-types/numeric/DECIMAL)
| 4/8/16/32 | 高精度定点数,格式:DECIMAL(P[,S])。其中,P 代表一共有多少个有效数字(precision),S
代表小数位有多少数字(scale)。有效数字 P 的范围是 [1, MAX_P],`enable_decimal256`=false
时,MAX_P=38,`enable_decimal256`=true 时,MAX_P=76。小数位数字数量 S 的范围是 [0,
P]。<br>`enable_decimal256` 的默认值是 false,设置为 true
可以获得更加精确的结果,但是会带来一些性能损失。<br>存储空间:<ul><li>0 < precision <= 9 时,占用 4
字节。</li><li>9 < precision <= 18 时,占用 8 字节。</li><li>18 < precision <= 38 时,占用 16
字节。</li><li> [...]
### [日期类型](../sql-manual/basic-element/sql-data-types/data-type-overview#日期类型)
diff --git
a/i18n/zh-CN/docusaurus-plugin-content-docs/version-3.x/table-design/index/ngram-bloomfilter-index.md
b/i18n/zh-CN/docusaurus-plugin-content-docs/version-3.x/table-design/index/ngram-bloomfilter-index.md
index 1e98ec73560..1223ba85cc9 100644
---
a/i18n/zh-CN/docusaurus-plugin-content-docs/version-3.x/table-design/index/ngram-bloomfilter-index.md
+++
b/i18n/zh-CN/docusaurus-plugin-content-docs/version-3.x/table-design/index/ngram-bloomfilter-index.md
@@ -179,7 +179,7 @@ mysql> SELECT COUNT() FROM amazon_reviews;
```
SELECT
product_id,
- any(product_title),
+ any_value(product_title),
AVG(star_rating) AS rating,
COUNT() AS count
FROM
diff --git
a/i18n/zh-CN/docusaurus-plugin-content-docs/version-4.x/data-operate/import/data-source/bigquery.md
b/i18n/zh-CN/docusaurus-plugin-content-docs/version-4.x/data-operate/import/data-source/bigquery.md
index b908ca53532..8a862fd9b2e 100644
---
a/i18n/zh-CN/docusaurus-plugin-content-docs/version-4.x/data-operate/import/data-source/bigquery.md
+++
b/i18n/zh-CN/docusaurus-plugin-content-docs/version-4.x/data-operate/import/data-source/bigquery.md
@@ -125,7 +125,7 @@ AS (
#### 2.2 查看 GCS 上的导出文件
-以上命令会将 `sales_data` 的数据导出到 GCS,每个分区会产生一个或多个文件,文件名递增。具体规则可参考
[exporting-data](https://cloud.google.com/bigquerydocs/exporting-data#exporting_data_into_one_or_more_files)。
+以上命令会将 `sales_data` 的数据导出到 GCS,每个分区会产生一个或多个文件,文件名递增。具体规则可参考
[exporting-data](https://cloud.google.com/bigquery/docs/exporting-data#exporting_data_into_one_or_more_files)。

diff --git
a/i18n/zh-CN/docusaurus-plugin-content-docs/version-4.x/lakehouse/catalogs/jdbc-catalog-overview.md
b/i18n/zh-CN/docusaurus-plugin-content-docs/version-4.x/lakehouse/catalogs/jdbc-catalog-overview.md
index fdf6c7d3cef..6dae931b15f 100644
---
a/i18n/zh-CN/docusaurus-plugin-content-docs/version-4.x/lakehouse/catalogs/jdbc-catalog-overview.md
+++
b/i18n/zh-CN/docusaurus-plugin-content-docs/version-4.x/lakehouse/catalogs/jdbc-catalog-overview.md
@@ -65,7 +65,7 @@ CREATE CATALOG [IF NOT EXISTS] catalog_name PROPERTIES (
2. 本地绝对路径。如 `file:///path/to/mysql-connector-j-8.3.0.jar`。需将 Jar
包预先存放在所有 FE/BE 节点指定的路径下。
- 3. Http
地址。如:`http://repo1.maven.org/maven2/com/mysql/mysql-connector-j/8.3.0/mysql-connector-j-8.3.0.jar`
系统会从这个 Http 地址下载 Driver 文件。仅支持无认证的 Http 服务。
+ 3. Http
地址。如:`https://repo1.maven.org/maven2/com/mysql/mysql-connector-j/8.3.0/mysql-connector-j-8.3.0.jar`
系统会从这个 Http 地址下载 Driver 文件。仅支持无认证的 Http 服务。
* 可选属性
diff --git
a/i18n/zh-CN/docusaurus-plugin-content-docs/version-4.x/table-design/data-type.md
b/i18n/zh-CN/docusaurus-plugin-content-docs/version-4.x/table-design/data-type.md
index 714216786d0..acef5f85849 100644
---
a/i18n/zh-CN/docusaurus-plugin-content-docs/version-4.x/table-design/data-type.md
+++
b/i18n/zh-CN/docusaurus-plugin-content-docs/version-4.x/table-design/data-type.md
@@ -21,7 +21,7 @@ Apache Doris 已支持的数据类型列表如下:
| [LARGEINT](../sql-manual/basic-element/sql-data-types/numeric/LARGEINT)
| 16 | 有符号整数,范围 [-2^127 + 1 ~ 2^127 - 1]。 |
| [FLOAT](../sql-manual/basic-element/sql-data-types/numeric/FLOATING-POINT)
| 4 | 浮点数,范围 [-3.4*10^38 ~ 3.4*10^38]。 |
| [DOUBLE](../sql-manual/basic-element/sql-data-types/numeric/FLOATING-POINT)
| 8 | 浮点数,范围 [-1.79*10^308 ~ 1.79*10^308]。 |
-| [DECIMAL](../sql-manual/basic-element/sql-data-types/numeric/DECIMAL)
| 4/8/16/32 | 高精度定点数,格式:DECIMAL(P[,S])。其中,P 代表一共有多少个有效数字(precision),S
代表小数位有多少数字(scale)。有效数字 P 的范围是 [1, MAX_P],`enable_decimal256`=false
时,MAX_P=38,`enable_decimal256`=true 时,MAX_P=76。小数位数字数量 S 的范围是 [0,
P]。<br>`enable_decimal256` 的默认值是 false,设置为 true
可以获得更加精确的结果,但是会带来一些性能损失。<br>存储空间:<ul><li>0 < precision <= 9 时,占用 4 字节。<li>9 <
precision <= 18 时,占用 8 字节。<li>16 < precision <= 38 时,占用 16 字节。<li>38 <
precision [...]
+| [DECIMAL](../sql-manual/basic-element/sql-data-types/numeric/DECIMAL)
| 4/8/16/32 | 高精度定点数,格式:DECIMAL(P[,S])。其中,P 代表一共有多少个有效数字(precision),S
代表小数位有多少数字(scale)。有效数字 P 的范围是 [1, MAX_P],`enable_decimal256`=false
时,MAX_P=38,`enable_decimal256`=true 时,MAX_P=76。小数位数字数量 S 的范围是 [0,
P]。<br>`enable_decimal256` 的默认值是 false,设置为 true
可以获得更加精确的结果,但是会带来一些性能损失。<br>存储空间:<ul><li>0 < precision <= 9 时,占用 4
字节。</li><li>9 < precision <= 18 时,占用 8 字节。</li><li>18 < precision <= 38 时,占用 16
字节。</li><li> [...]
### [日期类型](../sql-manual/basic-element/sql-data-types/data-type-overview#日期类型)
diff --git
a/i18n/zh-CN/docusaurus-plugin-content-docs/version-4.x/table-design/index/inverted-index/custom-analyzer.md
b/i18n/zh-CN/docusaurus-plugin-content-docs/version-4.x/table-design/index/inverted-index/custom-analyzer.md
index 3c1cde7f49c..803ec96eed9 100644
---
a/i18n/zh-CN/docusaurus-plugin-content-docs/version-4.x/table-design/index/inverted-index/custom-analyzer.md
+++
b/i18n/zh-CN/docusaurus-plugin-content-docs/version-4.x/table-design/index/inverted-index/custom-analyzer.md
@@ -91,8 +91,8 @@ PROPERTIES (
| 类型 | 说明 | 主要参数 |
| --- | --- | --- |
| `standard` | 标准分词(遵循 Unicode 文本分割),适用于多数语言 | 无 |
-| `ngram` | 按 N 元组切分 | `min_ngram`、`max_ngram`、`token_chars` |
-| `edge_ngram` | 从词首起始位置生成 N 元组 | `min_ngram`、`max_ngram`、`token_chars` |
+| `ngram` | 按 N 元组切分 | `min_gram`、`max_gram`、`token_chars` |
+| `edge_ngram` | 从词首起始位置生成 N 元组 | `min_gram`、`max_gram`、`token_chars` |
| `keyword` | 整段文本作为一个词项输出,常与 token_filter 组合 | 无 |
| `char_group` | 按给定字符切分 | `tokenize_on_chars` |
| `basic` | 简单英文 / 数字 / 中文 / Unicode 分词 | `extra_chars` |
@@ -100,8 +100,8 @@ PROPERTIES (
参数说明:
-- `min_ngram`:最小长度(默认 1)
-- `max_ngram`:最大长度(默认 2)
+- `min_gram`:最小长度(默认 1)
+- `max_gram`:最大长度(默认 2)
-
`token_chars`:保留字符类别(默认保留全部)。可选:`letter`、`digit`、`whitespace`、`punctuation`、`symbol`
- `tokenize_on_chars`:字符列表或类别,类别支持
`whitespace`、`letter`、`digit`、`punctuation`、`symbol`、`cjk`
- `extra_chars`:额外分割的 ASCII 字符(如 `[]().`)
diff --git
a/i18n/zh-CN/docusaurus-plugin-content-docs/version-4.x/table-design/index/ngram-bloomfilter-index.md
b/i18n/zh-CN/docusaurus-plugin-content-docs/version-4.x/table-design/index/ngram-bloomfilter-index.md
index 784979c65f5..aefbe1b2d0e 100644
---
a/i18n/zh-CN/docusaurus-plugin-content-docs/version-4.x/table-design/index/ngram-bloomfilter-index.md
+++
b/i18n/zh-CN/docusaurus-plugin-content-docs/version-4.x/table-design/index/ngram-bloomfilter-index.md
@@ -239,7 +239,7 @@ mysql> SELECT COUNT() FROM amazon_reviews;
```sql
SELECT
product_id,
- any(product_title),
+ any_value(product_title),
AVG(star_rating) AS rating,
COUNT() AS count
FROM
diff --git
a/i18n/zh-CN/docusaurus-plugin-content-docs/version-4.x/table-design/index/vector-index/hnsw.md
b/i18n/zh-CN/docusaurus-plugin-content-docs/version-4.x/table-design/index/vector-index/hnsw.md
index 05ddb68fef0..bc941bb2504 100644
---
a/i18n/zh-CN/docusaurus-plugin-content-docs/version-4.x/table-design/index/vector-index/hnsw.md
+++
b/i18n/zh-CN/docusaurus-plugin-content-docs/version-4.x/table-design/index/vector-index/hnsw.md
@@ -418,7 +418,7 @@ Doris 的 ANN 索引基于 Meta 开源的
[faiss](https://github.com/facebookres
#### 内存空间与性能
-> **HNSW 索引(无量化压缩)占用的内存空间约为其检索向量内存大小的 1.2 倍。**
+> **HNSW 索引(无量化压缩)占用的内存空间约为其检索向量内存大小的 1.3 倍。**
例如 128 维、1M 数据集,HNSW FLAT 索引大约需要 `128 × 4 × 1,000,000 × 1.3 ≈ 650 MB`。
diff --git
a/i18n/zh-CN/docusaurus-plugin-content-docs/version-4.x/table-design/index/vector-index/ivf.md
b/i18n/zh-CN/docusaurus-plugin-content-docs/version-4.x/table-design/index/vector-index/ivf.md
index 37afe8af615..fdccc710528 100644
---
a/i18n/zh-CN/docusaurus-plugin-content-docs/version-4.x/table-design/index/vector-index/ivf.md
+++
b/i18n/zh-CN/docusaurus-plugin-content-docs/version-4.x/table-design/index/vector-index/ivf.md
@@ -396,7 +396,7 @@ Doris 的 ANN 索引基于 Meta 开源的
[faiss](https://github.com/facebookres
| dim | rows | 预估内存 |
| --- | --- | --- |
-| 128 | 1M | 496 MB |
+| 128 | 1M | 500 MB |
| 768 | 1M | 2.9 GB |
为保证查询性能,BE 必须有足够的内存容纳全部索引;否则索引文件频繁 IO 会导致查询性能大幅衰减。
diff --git
a/i18n/zh-CN/docusaurus-plugin-content-docs/version-4.x/table-design/index/vector-index/overview.md
b/i18n/zh-CN/docusaurus-plugin-content-docs/version-4.x/table-design/index/vector-index/overview.md
index d3a91b99311..844c92cb69a 100644
---
a/i18n/zh-CN/docusaurus-plugin-content-docs/version-4.x/table-design/index/vector-index/overview.md
+++
b/i18n/zh-CN/docusaurus-plugin-content-docs/version-4.x/table-design/index/vector-index/overview.md
@@ -478,7 +478,7 @@ PROPERTIES (
```java
// use `?` for placement holders, readStatement should be reused
PreparedStatement readStatement = conn.prepareStatement("SELECT id,
l2_distance_approximate(embedding, cast (? as ARRAY<FLOAT>)) AS distance
- FROM l2_distance_approximate
+ FROM sift_1M
ORDER BY distance
LIMIT 10");
diff --git
a/versioned_docs/version-2.1/lakehouse/catalogs/jdbc-catalog-overview.md
b/versioned_docs/version-2.1/lakehouse/catalogs/jdbc-catalog-overview.md
index faf147387cc..614408b718d 100644
--- a/versioned_docs/version-2.1/lakehouse/catalogs/jdbc-catalog-overview.md
+++ b/versioned_docs/version-2.1/lakehouse/catalogs/jdbc-catalog-overview.md
@@ -65,7 +65,7 @@ CREATE CATALOG [IF NOT EXISTS] catalog_name PROPERTIES (
2. Local absolute path. For example,
`file:///path/to/mysql-connector-j-8.3.0.jar`. The Jar file must be pre-placed
in the specified path on all FE/BE nodes.
- 3. HTTP URL. For example:
`http://repo1.maven.org/maven2/com/mysql/mysql-connector-j/8.3.0/mysql-connector-j-8.3.0.jar`.
The system will download the driver file from this HTTP address. Only supports
HTTP services without authentication.
+ 3. HTTP URL. For example:
`https://repo1.maven.org/maven2/com/mysql/mysql-connector-j/8.3.0/mysql-connector-j-8.3.0.jar`.
The system will download the driver file from this HTTP address. Only supports
HTTP services without authentication.
* Optional Properties
diff --git
a/versioned_docs/version-2.1/sql-manual/sql-statements/table-and-view/view/SHOW-VIEW.md
b/versioned_docs/version-2.1/sql-manual/sql-statements/table-and-view/view/SHOW-VIEW.md
index 9fbb09c9696..dbe9857ff6e 100644
---
a/versioned_docs/version-2.1/sql-manual/sql-statements/table-and-view/view/SHOW-VIEW.md
+++
b/versioned_docs/version-2.1/sql-manual/sql-statements/table-and-view/view/SHOW-VIEW.md
@@ -10,7 +10,7 @@
This statement is used to display all views based on the given table
-grammar:
+## Syntax
```sql
SHOW VIEW { FROM | IN } table [ FROM db ]
diff --git
a/versioned_docs/version-2.1/table-design/index/ngram-bloomfilter-index.md
b/versioned_docs/version-2.1/table-design/index/ngram-bloomfilter-index.md
index 705d2aa385c..9303c1f0306 100644
--- a/versioned_docs/version-2.1/table-design/index/ngram-bloomfilter-index.md
+++ b/versioned_docs/version-2.1/table-design/index/ngram-bloomfilter-index.md
@@ -169,7 +169,7 @@ mysql> SELECT COUNT(*) FROM amazon_reviews;
```sql
SELECT
product_id,
- any(product_title),
+ any_value(product_title),
AVG(star_rating) AS rating,
COUNT(*) AS count
FROM
diff --git
a/versioned_docs/version-3.x/lakehouse/catalogs/jdbc-catalog-overview.md
b/versioned_docs/version-3.x/lakehouse/catalogs/jdbc-catalog-overview.md
index faf147387cc..614408b718d 100644
--- a/versioned_docs/version-3.x/lakehouse/catalogs/jdbc-catalog-overview.md
+++ b/versioned_docs/version-3.x/lakehouse/catalogs/jdbc-catalog-overview.md
@@ -65,7 +65,7 @@ CREATE CATALOG [IF NOT EXISTS] catalog_name PROPERTIES (
2. Local absolute path. For example,
`file:///path/to/mysql-connector-j-8.3.0.jar`. The Jar file must be pre-placed
in the specified path on all FE/BE nodes.
- 3. HTTP URL. For example:
`http://repo1.maven.org/maven2/com/mysql/mysql-connector-j/8.3.0/mysql-connector-j-8.3.0.jar`.
The system will download the driver file from this HTTP address. Only supports
HTTP services without authentication.
+ 3. HTTP URL. For example:
`https://repo1.maven.org/maven2/com/mysql/mysql-connector-j/8.3.0/mysql-connector-j-8.3.0.jar`.
The system will download the driver file from this HTTP address. Only supports
HTTP services without authentication.
* Optional Properties
diff --git
a/versioned_docs/version-3.x/sql-manual/sql-statements/table-and-view/view/SHOW-VIEW.md
b/versioned_docs/version-3.x/sql-manual/sql-statements/table-and-view/view/SHOW-VIEW.md
index 312fee6ff54..259700e46b3 100644
---
a/versioned_docs/version-3.x/sql-manual/sql-statements/table-and-view/view/SHOW-VIEW.md
+++
b/versioned_docs/version-3.x/sql-manual/sql-statements/table-and-view/view/SHOW-VIEW.md
@@ -10,7 +10,7 @@
This statement is used to display all views based on the given table
-grammar:
+## Syntax
```sql
SHOW VIEW { FROM | IN } table [ FROM db ]
@@ -28,5 +28,3 @@ grammar:
SHOW, VIEW
-## Best Practice
-
diff --git
a/versioned_docs/version-3.x/table-design/index/ngram-bloomfilter-index.md
b/versioned_docs/version-3.x/table-design/index/ngram-bloomfilter-index.md
index 705d2aa385c..9303c1f0306 100644
--- a/versioned_docs/version-3.x/table-design/index/ngram-bloomfilter-index.md
+++ b/versioned_docs/version-3.x/table-design/index/ngram-bloomfilter-index.md
@@ -169,7 +169,7 @@ mysql> SELECT COUNT(*) FROM amazon_reviews;
```sql
SELECT
product_id,
- any(product_title),
+ any_value(product_title),
AVG(star_rating) AS rating,
COUNT(*) AS count
FROM
diff --git
a/versioned_docs/version-4.x/data-operate/import/data-source/bigquery.md
b/versioned_docs/version-4.x/data-operate/import/data-source/bigquery.md
index 1f1a10b6b0a..d7220d9c24c 100644
--- a/versioned_docs/version-4.x/data-operate/import/data-source/bigquery.md
+++ b/versioned_docs/version-4.x/data-operate/import/data-source/bigquery.md
@@ -125,7 +125,7 @@ AS (
#### 2.2 Inspect the exported files on GCS
-The command above exports `sales_data` to GCS. Each partition produces one or
more files with incrementing file names. For details, see
[exporting-data](https://cloud.google.com/bigquerydocs/exporting-data#exporting_data_into_one_or_more_files).
+The command above exports `sales_data` to GCS. Each partition produces one or
more files with incrementing file names. For details, see
[exporting-data](https://cloud.google.com/bigquery/docs/exporting-data#exporting_data_into_one_or_more_files).

diff --git
a/versioned_docs/version-4.x/lakehouse/catalogs/jdbc-catalog-overview.md
b/versioned_docs/version-4.x/lakehouse/catalogs/jdbc-catalog-overview.md
index faf147387cc..614408b718d 100644
--- a/versioned_docs/version-4.x/lakehouse/catalogs/jdbc-catalog-overview.md
+++ b/versioned_docs/version-4.x/lakehouse/catalogs/jdbc-catalog-overview.md
@@ -65,7 +65,7 @@ CREATE CATALOG [IF NOT EXISTS] catalog_name PROPERTIES (
2. Local absolute path. For example,
`file:///path/to/mysql-connector-j-8.3.0.jar`. The Jar file must be pre-placed
in the specified path on all FE/BE nodes.
- 3. HTTP URL. For example:
`http://repo1.maven.org/maven2/com/mysql/mysql-connector-j/8.3.0/mysql-connector-j-8.3.0.jar`.
The system will download the driver file from this HTTP address. Only supports
HTTP services without authentication.
+ 3. HTTP URL. For example:
`https://repo1.maven.org/maven2/com/mysql/mysql-connector-j/8.3.0/mysql-connector-j-8.3.0.jar`.
The system will download the driver file from this HTTP address. Only supports
HTTP services without authentication.
* Optional Properties
diff --git
a/versioned_docs/version-4.x/sql-manual/sql-statements/table-and-view/view/SHOW-VIEW.md
b/versioned_docs/version-4.x/sql-manual/sql-statements/table-and-view/view/SHOW-VIEW.md
index 312fee6ff54..259700e46b3 100644
---
a/versioned_docs/version-4.x/sql-manual/sql-statements/table-and-view/view/SHOW-VIEW.md
+++
b/versioned_docs/version-4.x/sql-manual/sql-statements/table-and-view/view/SHOW-VIEW.md
@@ -10,7 +10,7 @@
This statement is used to display all views based on the given table
-grammar:
+## Syntax
```sql
SHOW VIEW { FROM | IN } table [ FROM db ]
@@ -28,5 +28,3 @@ grammar:
SHOW, VIEW
-## Best Practice
-
diff --git a/versioned_docs/version-4.x/table-design/data-type.md
b/versioned_docs/version-4.x/table-design/data-type.md
index e634e5876d4..544fa9d81b1 100644
--- a/versioned_docs/version-4.x/table-design/data-type.md
+++ b/versioned_docs/version-4.x/table-design/data-type.md
@@ -21,7 +21,7 @@ The list of data types supported by Apache Doris is as
follows:
| [LARGEINT](../sql-manual/basic-element/sql-data-types/numeric/LARGEINT)
| 16 | Signed integer, range [-2^127 + 1 ~ 2^127 - 1]. |
| [FLOAT](../sql-manual/basic-element/sql-data-types/numeric/FLOATING-POINT)
| 4 | Floating-point number, range [-3.4*10^38 ~ 3.4*10^38].
|
| [DOUBLE](../sql-manual/basic-element/sql-data-types/numeric/FLOATING-POINT)
| 8 | Floating-point number, range [-1.79*10^308 ~ 1.79*10^308].
|
-| [DECIMAL](../sql-manual/basic-element/sql-data-types/numeric/DECIMAL)
| 4/8/16/32 | High-precision fixed-point number. Format: DECIMAL(P[,S]). P
represents the total number of significant digits (precision), and S represents
the number of digits after the decimal point (scale). The range of P is [1,
MAX_P]. When `enable_decimal256`=false, MAX_P=38; when
`enable_decimal256`=true, MAX_P=76. The range of S is [0, P].<br>The default
value of `enable_decimal256` is false. Setting [...]
+| [DECIMAL](../sql-manual/basic-element/sql-data-types/numeric/DECIMAL)
| 4/8/16/32 | High-precision fixed-point number. Format: DECIMAL(P[,S]). P
represents the total number of significant digits (precision), and S represents
the number of digits after the decimal point (scale). The range of P is [1,
MAX_P]. When `enable_decimal256`=false, MAX_P=38; when
`enable_decimal256`=true, MAX_P=76. The range of S is [0, P].<br>The default
value of `enable_decimal256` is false. Setting [...]
### [Date
Types](../sql-manual/basic-element/sql-data-types/data-type-overview#date-types)
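
Reviewer note on the DECIMAL hunks: the corrected storage boundaries (`0 < precision <= 9`, `9 < precision <= 18`, `18 < precision <= 38`) can be sanity-checked with a small lookup. This is only a sketch and not Doris code. The function name `decimal_storage_bytes` is hypothetical, and the 32-byte tier for `38 < precision <= 76` is an assumption inferred from the `4/8/16/32` size column and `MAX_P=76`, because that bullet is truncated in this diff.

```python
def decimal_storage_bytes(precision: int, enable_decimal256: bool = False) -> int:
    """Return the storage size in bytes for DECIMAL(precision, ...).

    Hypothetical helper mirroring the tiers listed in data-type.md.
    """
    # MAX_P is 38 by default and 76 when enable_decimal256 is true.
    max_p = 76 if enable_decimal256 else 38
    if not 1 <= precision <= max_p:
        raise ValueError(f"precision must be in [1, {max_p}], got {precision}")
    if precision <= 9:
        return 4
    if precision <= 18:
        return 8
    if precision <= 38:  # the fixed lower bound: 18 < precision, not 16 < precision
        return 16
    # Assumption: the remaining tier (38 < precision <= 76) uses 32 bytes.
    return 32


if __name__ == "__main__":
    for p in (9, 10, 17, 18, 19, 38):
        print(p, decimal_storage_bytes(p))
```

With the old `16 < precision <= 38` wording, precisions 17 and 18 would have matched two bullets at once; the lookup above shows each precision mapping to exactly one tier.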
diff --git
a/versioned_docs/version-4.x/table-design/index/inverted-index/custom-analyzer.md
b/versioned_docs/version-4.x/table-design/index/inverted-index/custom-analyzer.md
index 3efbc55b36b..3b901de8cae 100644
---
a/versioned_docs/version-4.x/table-design/index/inverted-index/custom-analyzer.md
+++
b/versioned_docs/version-4.x/table-design/index/inverted-index/custom-analyzer.md
@@ -4,7 +4,6 @@
"language": "en",
"description": "Doris custom analyzers combine character filters,
tokenizers, and token filters to flexibly control text segmentation strategies,
improving the search relevance and precision of inverted indexes.",
"keywords": [
- "custom analyzer",
"custom analyzer",
"inverted index tokenizer",
"tokenizer",
@@ -91,8 +90,8 @@ Supported tokenizer types:
| Type | Description | Main Parameters |
| --- | --- | --- |
| `standard` | Standard tokenization (follows Unicode text segmentation),
suitable for most languages | None |
-| `ngram` | Splits by N-grams | `min_ngram`, `max_ngram`, `token_chars` |
-| `edge_ngram` | Generates N-grams starting from the beginning of the word |
`min_ngram`, `max_ngram`, `token_chars` |
+| `ngram` | Splits by N-grams | `min_gram`, `max_gram`, `token_chars` |
+| `edge_ngram` | Generates N-grams starting from the beginning of the word |
`min_gram`, `max_gram`, `token_chars` |
| `keyword` | Outputs the entire text as a single term, often combined with
token_filter | None |
| `char_group` | Splits by the given characters | `tokenize_on_chars` |
| `basic` | Simple English / digit / Chinese / Unicode tokenization |
`extra_chars` |
@@ -100,8 +99,8 @@ Supported tokenizer types:
Parameter descriptions:
-- `min_ngram`: minimum length (default 1)
-- `max_ngram`: maximum length (default 2)
+- `min_gram`: minimum length (default 1)
+- `max_gram`: maximum length (default 2)
- `token_chars`: character categories to keep (default: keep all). Options:
`letter`, `digit`, `whitespace`, `punctuation`, `symbol`
- `tokenize_on_chars`: a character list or category. Categories support
`whitespace`, `letter`, `digit`, `punctuation`, `symbol`, `cjk`
- `extra_chars`: additional ASCII characters to split on (such as `[]().`)
@@ -503,5 +502,3 @@ Result:
1. Nesting multiple components in a custom `analyzer` may degrade tokenization
performance.
2. The `select tokenize` tokenization function supports custom analyzers and
can be used to debug tokenization results.
3. Only one of the predefined `built_in_analyzer` and a custom `analyzer` can
exist on the same index.
-</content>
-</invoke>
\ No newline at end of file
diff --git
a/versioned_docs/version-4.x/table-design/index/ngram-bloomfilter-index.md
b/versioned_docs/version-4.x/table-design/index/ngram-bloomfilter-index.md
index c30319a6fc5..844b3bf1001 100644
--- a/versioned_docs/version-4.x/table-design/index/ngram-bloomfilter-index.md
+++ b/versioned_docs/version-4.x/table-design/index/ngram-bloomfilter-index.md
@@ -239,7 +239,7 @@ mysql> SELECT COUNT() FROM amazon_reviews;
```sql
SELECT
product_id,
- any(product_title),
+ any_value(product_title),
AVG(star_rating) AS rating,
COUNT() AS count
FROM
diff --git a/versioned_docs/version-4.x/table-design/index/prefix-index.md
b/versioned_docs/version-4.x/table-design/index/prefix-index.md
index e9dcb06a354..9d86959ea02 100644
--- a/versioned_docs/version-4.x/table-design/index/prefix-index.md
+++ b/versioned_docs/version-4.x/table-design/index/prefix-index.md
@@ -6,8 +6,6 @@
"keywords": [
"Prefix Index",
"Sort Key",
- "Sort Key",
- "Prefix Index",
"Apache Doris",
"sparse index",
"query acceleration",
diff --git a/versioned_docs/version-4.x/table-design/index/vector-index/hnsw.md
b/versioned_docs/version-4.x/table-design/index/vector-index/hnsw.md
index b1ec0dd0823..8dbad1d57ad 100644
--- a/versioned_docs/version-4.x/table-design/index/vector-index/hnsw.md
+++ b/versioned_docs/version-4.x/table-design/index/vector-index/hnsw.md
@@ -418,7 +418,7 @@ Before high-concurrency queries, run a cold query first to
warm up the index fil
#### Memory Footprint and Performance
-> **An HNSW index (without quantization compression) takes about 1.2x the
memory of the vectors it indexes.**
+> **An HNSW index (without quantization compression) takes about 1.3x the
memory of the vectors it indexes.**
For example, for a 128-dimensional, 1M dataset, an HNSW FLAT index needs about
`128 x 4 x 1,000,000 x 1.3 ~= 650 MB`.
diff --git a/versioned_docs/version-4.x/table-design/index/vector-index/ivf.md
b/versioned_docs/version-4.x/table-design/index/vector-index/ivf.md
index 78050cde077..d0ab6383c71 100644
--- a/versioned_docs/version-4.x/table-design/index/vector-index/ivf.md
+++ b/versioned_docs/version-4.x/table-design/index/vector-index/ivf.md
@@ -395,7 +395,7 @@ Reference values:
| dim | rows | Estimated memory |
| --- | --- | --- |
-| 128 | 1M | 496 MB |
+| 128 | 1M | 500 MB |
| 768 | 1M | 2.9 GB |
To guarantee query performance, the BE must have enough memory to hold the
entire index. Otherwise, frequent IO on index files causes severe query
performance degradation.
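
Reviewer note on the hnsw.md and ivf.md hunks: the memory figures they touch can be cross-checked with a few lines of arithmetic. This is a rough sketch that assumes 4-byte float32 components and applies the 1.3x HNSW factor exactly as quoted in the doc. The helper names are hypothetical, and no IVF overhead is modeled, since the diff does not state one.

```python
MB = 1_000_000        # decimal megabyte
MIB = 1024 ** 2       # binary mebibyte
GIB = 1024 ** 3       # binary gibibyte


def raw_vector_bytes(dim: int, rows: int, bytes_per_component: int = 4) -> int:
    """Size of `rows` vectors of `dim` float32 components, in bytes."""
    return dim * bytes_per_component * rows


def hnsw_flat_bytes(dim: int, rows: int, overhead: float = 1.3) -> float:
    """Estimated HNSW FLAT index size: raw vectors times the documented factor."""
    return raw_vector_bytes(dim, rows) * overhead


if __name__ == "__main__":
    raw_128 = raw_vector_bytes(128, 1_000_000)   # 512,000,000 bytes
    hnsw_128 = hnsw_flat_bytes(128, 1_000_000)
    raw_768 = raw_vector_bytes(768, 1_000_000)   # 3,072,000,000 bytes
    print(f"128-dim raw : {raw_128 / MB:.1f} MB / {raw_128 / MIB:.1f} MiB")
    print(f"128-dim HNSW: {hnsw_128 / MB:.1f} MB / {hnsw_128 / MIB:.1f} MiB")
    print(f"768-dim raw : {raw_768 / GIB:.2f} GiB")
```

Under these assumptions the 128-dimensional, 1M-row HNSW estimate comes out between roughly 635 MiB and 666 MB depending on the unit, so the doc's `~= 650 MB` reads as a rounded value, and a 1.2x factor would not reach it in either unit. The raw 128-dimensional size (512 MB, about 488 MiB) and the raw 768-dimensional size (about 2.86 GiB) are in the same range as the `500 MB` and `2.9 GB` rows of the IVF table.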
diff --git
a/versioned_docs/version-4.x/table-design/index/vector-index/overview.md
b/versioned_docs/version-4.x/table-design/index/vector-index/overview.md
index 7797e7a873d..9fe2be69766 100644
--- a/versioned_docs/version-4.x/table-design/index/vector-index/overview.md
+++ b/versioned_docs/version-4.x/table-design/index/vector-index/overview.md
@@ -478,7 +478,7 @@ Common embedding-model outputs are typically 768 dimensions
or higher. If you em
```java
// use `?` for placement holders, readStatement should be reused
PreparedStatement readStatement = conn.prepareStatement("SELECT id,
l2_distance_approximate(embedding, cast (? as ARRAY<FLOAT>)) AS distance
- FROM l2_distance_approximate
+ FROM sift_1M
ORDER BY distance
LIMIT 10");