This is an automated email from the ASF dual-hosted git repository.

morningman pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/doris-website.git


The following commit(s) were added to refs/heads/master by this push:
     new 7466507bd5b [docs] Add file path pattern documentation for S3 TVF and 
Broker Load (#3337)
7466507bd5b is described below

commit 7466507bd5bf4d1fe341239981e80dde9b83d4d5
Author: Yongqiang YANG <[email protected]>
AuthorDate: Wed Feb 4 19:34:41 2026 -0800

    [docs] Add file path pattern documentation for S3 TVF and Broker Load 
(#3337)
    
    ## Summary
    
    - Add new documentation page for file path patterns under
    `sql-manual/basic-element`
    - Document supported URI formats (S3, HDFS, and other cloud providers)
    - Document wildcard patterns (`*`, `?`, `[...]`) and range expansion
    (`{1..10}`)
    - Add usage examples for S3 TVF, Broker Load, and INSERT INTO SELECT
    - Include performance considerations and troubleshooting guide
    - Update S3 TVF documentation to reference the new file-path-pattern
    page
    - Update Broker Load documentation to reference the new
    file-path-pattern page
    - Add sidebar entry for new documentation
    
    ## Changes
    
    1. **New file**: `docs/sql-manual/basic-element/file-path-pattern.md`
       - Comprehensive documentation on file path patterns
       - Covers S3-style URIs, HDFS URIs, and other cloud storage
       - Documents wildcard patterns and range expansion syntax
       - Includes practical examples and performance tips
    
    2. **Updated**:
    `docs/sql-manual/sql-functions/table-valued-functions/s3.md`
    - Added reference to file-path-pattern documentation in URI parameter
    description
    - Updated "URI with Wildcards" section with reference to comprehensive
    docs
    
    3. **Updated**:
    `docs/data-operate/import/import-way/broker-load-manual.md`
       - Added "Supported file path patterns" section in Limitations
    - Added reference to file-path-pattern documentation in wildcard example
    section
    
    4. **Updated**: `sidebars.ts`
       - Added sidebar entry for `file-path-pattern` under Basic Elements
    
    ## Test plan
    
    - [ ] Verify documentation builds correctly
    - [ ] Verify sidebar navigation works
    - [ ] Verify cross-references are correct
    
    🤖 Generated with [Claude Code](https://claude.com/claude-code)
---
 .../import/import-way/broker-load-manual.md        |   8 +
 .../import/import-way/insert-into-manual.md        |   4 +-
 docs/lakehouse/file-analysis.md                    |  20 +-
 docs/sql-manual/basic-element/file-path-pattern.md | 309 +++++++++++++++++++++
 .../sql-functions/table-valued-functions/s3.md     |   4 +-
 .../import/import-way/broker-load-manual.md        |   8 +
 .../import/import-way/insert-into-manual.md        |   4 +-
 .../current/lakehouse/file-analysis.md             |  20 +-
 .../sql-manual/basic-element/file-path-pattern.md  | 309 +++++++++++++++++++++
 .../sql-functions/table-valued-functions/s3.md     |   4 +-
 .../import/import-way/broker-load-manual.md        |   8 +
 .../import/import-way/insert-into-manual.md        |   4 +-
 .../version-4.x/lakehouse/file-analysis.md         |  20 +-
 .../sql-manual/basic-element/file-path-pattern.md  | 309 +++++++++++++++++++++
 .../sql-functions/table-valued-functions/s3.md     |   5 +-
 sidebars.ts                                        |   1 +
 .../import/import-way/broker-load-manual.md        |   8 +
 .../import/import-way/insert-into-manual.md        |   4 +-
 .../version-4.x/lakehouse/file-analysis.md         |  20 +-
 .../sql-manual/basic-element/file-path-pattern.md  | 309 +++++++++++++++++++++
 .../sql-functions/table-valued-functions/s3.md     |   4 +-
 versioned_sidebars/version-4.x-sidebars.json       |   1 +
 22 files changed, 1318 insertions(+), 65 deletions(-)

diff --git a/docs/data-operate/import/import-way/broker-load-manual.md 
b/docs/data-operate/import/import-way/broker-load-manual.md
index f292219832e..89bbba9fe15 100644
--- a/docs/data-operate/import/import-way/broker-load-manual.md
+++ b/docs/data-operate/import/import-way/broker-load-manual.md
@@ -24,6 +24,12 @@ Supported data sources:
 - HDFS protocol
 - Custom protocol (require broker process)
 
+Supported file path patterns:
+
+- Wildcards: `*`, `?`, `[abc]`, `[a-z]`
+- Range expansion: `{1..10}`, `{a,b,c}`
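+- Example: `file_{1..3}.csv` matches `file_1.csv`, `file_2.csv`, and `file_3.csv`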
+- See [File Path Pattern](../../../sql-manual/basic-element/file-path-pattern) 
for complete syntax
+
 Supported data types:
 
 - CSV
@@ -558,6 +564,8 @@ Different Broker types and access methods require different 
authentication infor
 
 ### Importing data from HDFS using wildcards to match two batches of files and 
importing them into two separate tables
 
+  Broker Load supports wildcards (`*`, `?`, `[...]`) and range patterns 
(`{1..10}`) in file paths. For detailed syntax, see [File Path 
Pattern](../../../sql-manual/basic-element/file-path-pattern).
+
   ```sql
   LOAD LABEL example_db.label2
   (
diff --git a/docs/data-operate/import/import-way/insert-into-manual.md 
b/docs/data-operate/import/import-way/insert-into-manual.md
index ae76125ec16..9a287c7b303 100644
--- a/docs/data-operate/import/import-way/insert-into-manual.md
+++ b/docs/data-operate/import/import-way/insert-into-manual.md
@@ -312,7 +312,9 @@ The INSERT command is a synchronous command. If it returns 
a result, that indica
 
 ## Ingest data by TVF
 
-Doris can directly query and analyze files stored in object storage or HDFS as 
tables through the Table Value Functions (TVFs), which supports automatic 
column type inference. For detailed information, please refer to the 
[Lakehouse/TVF 
documentation](https://doris.apache.org/docs/3.0/lakehouse/file-analysis).
+Doris can directly query and analyze files stored in object storage or HDFS as 
tables through Table Valued Functions (TVFs), which support automatic 
column type inference. For detailed information, please refer to the 
[Lakehouse/TVF documentation](../../../lakehouse/file-analysis).
+
+TVF supports wildcards (`*`, `?`, `[...]`) and range patterns (`{1..10}`) in 
file paths. For complete syntax, see [File Path 
Pattern](../../../sql-manual/basic-element/file-path-pattern).
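+
+For example, a single TVF query can ingest a whole batch of numbered files at 
once (bucket, table, and file names here are illustrative):
+
+```sql
+INSERT INTO my_table
+SELECT * FROM S3(
+    "uri" = "s3://my-bucket/data/file_{1..10}.csv",
+    "s3.access_key" = "xxx",
+    "s3.secret_key" = "xxx",
+    "s3.region" = "us-east-1",
+    "format" = "csv"
+);
+```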
 
 ### Automatic column type inference
 
diff --git a/docs/lakehouse/file-analysis.md b/docs/lakehouse/file-analysis.md
index c4b1d614c05..d739a51f71f 100644
--- a/docs/lakehouse/file-analysis.md
+++ b/docs/lakehouse/file-analysis.md
@@ -39,21 +39,15 @@ The attributes of a TVF include the file path to be 
analyzed, file format, conne
 
 ### Multiple File Import
 
-When importing, the file path (URI) supports wildcards for matching. Doris 
file path matching uses the [Glob matching 
pattern](https://en.wikipedia.org/wiki/Glob_(programming)#:~:text=glob%20%28%29%20%28%2F%20%C9%A1l%C9%92b%20%2F%29%20is%20a%20libc,into%20a%20list%20of%20names%20matching%20that%20pattern.),
 and has been extended on this basis to support more flexible file selection 
methods.
+The file path (URI) supports wildcards and range patterns for matching 
multiple files:
 
-- `file_{1..3}`: Matches files `file_1`, `file_2`, `file_3`
-- `file_{1,3}_{1,2}`: Matches files `file_1_1`, `file_1_2`, `file_3_1`, 
`file_3_2` (supports mixing with `{n..m}` notation, separated by commas)
-- `file_*`: Matches all files starting with `file_`
-- `*.parquet`: Matches all files with the `.parquet` suffix
-- `tvf_test/*`: Matches all files in the `tvf_test` directory
-- `*test*`: Matches files containing `test` in the filename
+| Pattern | Example | Matches |
+|---------|---------|---------|
+| `*` | `file_*` | All files starting with `file_` |
+| `{n..m}` | `file_{1..3}` | `file_1`, `file_2`, `file_3` |
+| `{a,b,c}` | `file_{a,b}` | `file_a`, `file_b` |
 
-**Notes**
-
-- In the `{1..3}` notation, the order can be reversed, `{3..1}` is also valid.
-- Notations like `file_{-1..2}` and `file_{a..4}` are not supported, as 
negative numbers or letters cannot be used as enumeration endpoints. However, 
`file_{1..3,11,a}` is allowed and will match files `file_1`, `file_2`, 
`file_3`, `file_11`, and `file_a`.
-- Doris tries to import as many files as possible. For paths like 
`file_{a..b,-1..3,4..5}` that contain incorrect notation, we will match files 
`file_4` and `file_5`.
-- When using commas with `{1..4,5}`, only numbers are allowed. Expressions 
like `{1..4,a}` are not supported; in this case, `{a}` will be ignored.
+For complete syntax including all supported wildcards, range expansion rules, 
and usage examples, see [File Path 
Pattern](../sql-manual/basic-element/file-path-pattern).
 
 
 ### Automatic Inference of File Column Types
diff --git a/docs/sql-manual/basic-element/file-path-pattern.md 
b/docs/sql-manual/basic-element/file-path-pattern.md
new file mode 100644
index 00000000000..77bcc96e99a
--- /dev/null
+++ b/docs/sql-manual/basic-element/file-path-pattern.md
@@ -0,0 +1,309 @@
+---
+{
+    "title": "File Path Pattern",
+    "language": "en",
+    "description": "File path patterns and wildcards supported by Doris for 
accessing files in remote storage systems like S3, HDFS, and other object 
storage."
+}
+---
+
+## Description
+
+When accessing files from remote storage systems (S3, HDFS, and other 
S3-compatible object storage), Doris supports flexible file path patterns 
including wildcards and range expressions. This document describes the 
supported path formats and pattern matching syntax.
+
+These path patterns are supported by:
+- [S3 TVF](../sql-functions/table-valued-functions/s3)
+- [HDFS TVF](../sql-functions/table-valued-functions/hdfs)
+- [Broker Load](../../data-operate/import/import-way/broker-load-manual)
+- INSERT INTO SELECT from TVF
+
+## Supported URI Formats
+
+### S3-Style URIs
+
+| Style | Format | Example |
+|-------|--------|---------|
+| AWS Client Style (Hadoop S3) | `s3://bucket/path/to/file` | 
`s3://my-bucket/data/file.csv` |
+| S3A Style | `s3a://bucket/path/to/file` | `s3a://my-bucket/data/file.csv` |
+| S3N Style | `s3n://bucket/path/to/file` | `s3n://my-bucket/data/file.csv` |
+| Virtual Host Style | `https://bucket.endpoint/path/to/file` | 
`https://my-bucket.s3.us-west-1.amazonaws.com/data/file.csv` |
+| Path Style | `https://endpoint/bucket/path/to/file` | 
`https://s3.us-west-1.amazonaws.com/my-bucket/data/file.csv` |
+
+### Other Cloud Storage URIs
+
+| Provider | Scheme | Example |
+|----------|--------|---------|
+| Alibaba Cloud OSS | `oss://` | `oss://my-bucket/data/file.csv` |
+| Tencent Cloud COS | `cos://`, `cosn://` | `cos://my-bucket/data/file.csv` |
+| Baidu Cloud BOS | `bos://` | `bos://my-bucket/data/file.csv` |
+| Huawei Cloud OBS | `obs://` | `obs://my-bucket/data/file.csv` |
+| Google Cloud Storage | `gs://` | `gs://my-bucket/data/file.csv` |
+| Azure Blob Storage | `azure://` | `azure://container/data/file.csv` |
+
+### HDFS URIs
+
+| Style | Format | Example |
+|-------|--------|---------|
+| Standard | `hdfs://namenode:port/path/to/file` | 
`hdfs://namenode:8020/user/data/file.csv` |
+| HA Mode | `hdfs://nameservice/path/to/file` | 
`hdfs://my-ha-cluster/user/data/file.csv` |
+
+## Wildcard Patterns
+
+Doris uses glob-style pattern matching for file paths. The following wildcards 
are supported:
+
+### Basic Wildcards
+
+| Pattern | Description | Example | Matches |
+|---------|-------------|---------|---------|
+| `*` | Matches zero or more characters within a path segment | `*.csv` | 
`file.csv`, `data.csv`, `a.csv` |
+| `?` | Matches exactly one character | `file?.csv` | `file1.csv`, 
`fileA.csv`, but not `file10.csv` |
+| `[abc]` | Matches any single character in brackets | `file[123].csv` | 
`file1.csv`, `file2.csv`, `file3.csv` |
+| `[a-z]` | Matches any single character in the range | `file[a-c].csv` | 
`filea.csv`, `fileb.csv`, `filec.csv` |
+| `[!abc]` | Matches any single character NOT in brackets | `file[!0-9].csv` | 
`filea.csv`, `fileb.csv`, but not `file1.csv` |
+
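+For instance (bucket and file names here are hypothetical), `?` and a 
character class can be combined in a single S3 TVF path:
+
+```sql
+-- `0?` matches `0` followed by any single character;
+-- use `0[0-9]` to restrict the second character to a digit
+SELECT * FROM S3(
+    "uri" = "s3://my-bucket/logs/log_0?.csv",
+    "s3.access_key" = "xxx",
+    "s3.secret_key" = "xxx",
+    "s3.region" = "us-east-1",
+    "format" = "csv"
+);
+```
+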
+### Range Expansion (Brace Patterns)
+
+Doris supports numeric range expansion using brace patterns `{start..end}`:
+
+| Pattern | Expansion | Matches |
+|---------|-----------|---------|
+| `{1..3}` | `{1,2,3}` | `1`, `2`, `3` |
+| `{01..05}` | `{1,2,3,4,5}` | `1`, `2`, `3`, `4`, `5` (leading zeros are NOT 
preserved) |
+| `{3..1}` | `{1,2,3}` | `1`, `2`, `3` (reverse ranges supported) |
+| `{a,b,c}` | `{a,b,c}` | `a`, `b`, `c` (enumeration) |
+| `{1..3,5,7..9}` | `{1,2,3,5,7,8,9}` | Mixed ranges and values |
+
+:::caution Note
+- Doris tries to match as many files as possible. Invalid parts in brace 
expressions are silently skipped, and valid parts are still expanded. For 
example, `file_{a..b,-1..3,4..5}` will match `file_4` and `file_5` (the invalid 
`a..b` and negative range `-1..3` are skipped, but `4..5` is expanded normally).
+- If a range endpoint is negative (e.g., `{-1..2}`), that range is skipped. If 
mixed with valid ranges (e.g., `{-1..2,1..3}`), only the valid range `1..3` is 
expanded.
+- When using comma-separated values with ranges, only numbers are allowed. For 
example, in `{1..4,a}`, the non-numeric `a` will be ignored, resulting in 
`{1,2,3,4}`.
+- Pure enumeration patterns like `{a,b,c}` (without `..` ranges) are passed 
directly to glob matching and work as expected.
+:::
+
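+Putting these rules together, a path such as `file_{1..3,5}.csv` matches 
`file_1.csv`, `file_2.csv`, `file_3.csv`, and `file_5.csv`.
+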
+### Combining Patterns
+
+Multiple patterns can be combined in a single path:
+
+```
+s3://bucket/data_{1..3}/file_*.csv
+```
+
+This matches:
+- `s3://bucket/data_1/file_a.csv`
+- `s3://bucket/data_1/file_b.csv`
+- `s3://bucket/data_2/file_a.csv`
+- ... and so on
+
+## Examples
+
+### S3 TVF Examples
+
+**Match all CSV files in a directory:**
+
+```sql
+SELECT * FROM S3(
+    "uri" = "s3://my-bucket/data/*.csv",
+    "s3.access_key" = "xxx",
+    "s3.secret_key" = "xxx",
+    "s3.region" = "us-east-1",
+    "format" = "csv"
+);
+```
+
+**Match files with numeric range:**
+
+```sql
+SELECT * FROM S3(
+    "uri" = "s3://my-bucket/logs/data_{1..10}.csv",
+    "s3.access_key" = "xxx",
+    "s3.secret_key" = "xxx",
+    "s3.region" = "us-east-1",
+    "format" = "csv"
+);
+```
+
+**Match files in date-partitioned directories:**
+
+```sql
+SELECT * FROM S3(
+    "uri" = "s3://my-bucket/logs/year=2024/month=*/day=*/data.parquet",
+    "s3.access_key" = "xxx",
+    "s3.secret_key" = "xxx",
+    "s3.region" = "us-east-1",
+    "format" = "parquet"
+);
+```
+
+:::caution Zero-Padded Directories
+For zero-padded directory names like `month=01`, `month=02`, use wildcards 
(`*`) instead of range patterns. The pattern `{01..12}` expands to 
`{1,2,...,12}` which won't match `month=01`.
+:::
+
+**Match numbered file splits (e.g., Spark output):**
+
+Because range expansion drops leading zeros (see the caution above), 
zero-padded split names such as `part-00042.csv` should be matched with 
wildcards or character classes rather than a `{00000..00099}` range:
+
+```sql
+SELECT * FROM S3(
+    "uri" = "s3://my-bucket/output/part-000[0-9][0-9].csv",
+    "s3.access_key" = "xxx",
+    "s3.secret_key" = "xxx",
+    "s3.region" = "us-east-1",
+    "format" = "csv"
+);
+```
+
+### Broker Load Examples
+
+**Load all CSV files matching a pattern:**
+
+```sql
+LOAD LABEL db.label_wildcard
+(
+    DATA INFILE("s3://my-bucket/data/file_*.csv")
+    INTO TABLE my_table
+    COLUMNS TERMINATED BY ","
+    FORMAT AS "CSV"
+    (col1, col2, col3)
+)
+WITH S3 (
+    "provider" = "S3",
+    "AWS_ENDPOINT" = "s3.us-west-2.amazonaws.com",
+    "AWS_ACCESS_KEY" = "xxx",
+    "AWS_SECRET_KEY" = "xxx",
+    "AWS_REGION" = "us-west-2"
+);
+```
+
+**Load files using numeric range expansion:**
+
+```sql
+LOAD LABEL db.label_range
+(
+    DATA INFILE("s3://my-bucket/exports/data_{1..5}.csv")
+    INTO TABLE my_table
+    COLUMNS TERMINATED BY ","
+    FORMAT AS "CSV"
+    (col1, col2, col3)
+)
+WITH S3 (
+    "provider" = "S3",
+    "AWS_ENDPOINT" = "s3.us-west-2.amazonaws.com",
+    "AWS_ACCESS_KEY" = "xxx",
+    "AWS_SECRET_KEY" = "xxx",
+    "AWS_REGION" = "us-west-2"
+);
+```
+
+**Load from HDFS with wildcards:**
+
+```sql
+LOAD LABEL db.label_hdfs_wildcard
+(
+    DATA INFILE("hdfs://namenode:8020/user/data/2024-*/*.csv")
+    INTO TABLE my_table
+    COLUMNS TERMINATED BY ","
+    FORMAT AS "CSV"
+    (col1, col2, col3)
+)
+WITH HDFS (
+    "fs.defaultFS" = "hdfs://namenode:8020",
+    "hadoop.username" = "user"
+);
+```
+
+**Load from HDFS with numeric range:**
+
+```sql
+LOAD LABEL db.label_hdfs_range
+(
+    DATA INFILE("hdfs://namenode:8020/data/file_{1..3,5,7..9}.csv")
+    INTO TABLE my_table
+    COLUMNS TERMINATED BY ","
+    FORMAT AS "CSV"
+    (col1, col2, col3)
+)
+WITH HDFS (
+    "fs.defaultFS" = "hdfs://namenode:8020",
+    "hadoop.username" = "user"
+);
+```
+
+### INSERT INTO SELECT Examples
+
+**Insert from S3 with wildcards:**
+
+```sql
+INSERT INTO my_table (col1, col2, col3)
+SELECT * FROM S3(
+    "uri" = "s3://my-bucket/data/part-*.parquet",
+    "s3.access_key" = "xxx",
+    "s3.secret_key" = "xxx",
+    "s3.region" = "us-east-1",
+    "format" = "parquet"
+);
+```
+
+## Performance Considerations
+
+### Use Specific Prefixes
+
+Doris extracts the longest non-wildcard prefix from your path pattern to 
optimize S3/HDFS listing operations. More specific prefixes result in faster 
file discovery.
+
+```sql
+-- Good: specific prefix reduces listing scope
+"uri" = "s3://bucket/data/2024/01/15/*.csv"
+
+-- Less optimal: broad wildcard at early path segment
+"uri" = "s3://bucket/data/**/file.csv"
+```
+
+### Prefer Range Patterns for Known Sequences
+
+When you know the exact file numbering, use range patterns instead of 
wildcards:
+
+```sql
+-- Better: explicit range (brace expansion drops leading zeros,
+-- so this assumes file names are not zero-padded)
+"uri" = "s3://bucket/data/part-{1..100}.csv"
+
+-- Less optimal: wildcard matches unknown files
+"uri" = "s3://bucket/data/part-*.csv"
+```
+
+### Avoid Deep Recursive Wildcards
+
+Deep recursive patterns like `**` can cause slow file listing on large buckets:
+
+```sql
+-- Avoid when possible
+"uri" = "s3://bucket/**/*.csv"
+
+-- Prefer explicit path structure
+"uri" = "s3://bucket/data/year=*/month=*/day=*/*.csv"
+```
+
+## Troubleshooting
+
+| Issue | Cause | Solution |
+|-------|-------|----------|
+| No files found | Pattern doesn't match any files | Verify the path and 
pattern syntax; test with a single file first |
+| Slow file listing | Wildcard too broad or too many files | Use more specific 
prefix; limit wildcard scope |
+| Invalid URI error | Malformed path syntax | Check URI scheme and bucket name 
format |
+| Access denied | Credentials or permissions issue | Verify S3/HDFS 
credentials and bucket policies |
+
+### Testing Path Patterns
+
+Before running a large load job, test your pattern with a limited query:
+
+```sql
+-- Test if files exist and match pattern
+SELECT * FROM S3(
+    "uri" = "s3://bucket/your/pattern/*.csv",
+    ...
+) LIMIT 1;
+```
+
+Use `DESC FUNCTION` to verify the schema of matched files:
+
+```sql
+DESC FUNCTION S3(
+    "uri" = "s3://bucket/your/pattern/*.csv",
+    ...
+);
+```
diff --git a/docs/sql-manual/sql-functions/table-valued-functions/s3.md 
b/docs/sql-manual/sql-functions/table-valued-functions/s3.md
index 07e5e4db552..368b5682181 100644
--- a/docs/sql-manual/sql-functions/table-valued-functions/s3.md
+++ b/docs/sql-manual/sql-functions/table-valued-functions/s3.md
@@ -31,7 +31,7 @@ S3(
 
 | Parameter       | Description                                                
                                                                           |
 
|-----------------|---------------------------------------------------------------------------------------------------------------------------------------|
-| `uri`           | URI for accessing S3. The function will use either Path 
Style or Virtual-hosted Style based on the `use_path_style` parameter        |
+| `uri`           | URI for accessing S3. Supports wildcards and range 
patterns. See [File Path Pattern](../../basic-element/file-path-pattern) for 
details. The function will use either Path Style or Virtual-hosted Style based 
on the `use_path_style` parameter        |
 | `s3.access_key` | Access key for S3                                          
                                                                           |
 | `s3.secret_key` | Secret key for S3                                          
                                                                           |
 | `s3.region`     | S3 region                                                  
                                                                           |
@@ -530,7 +530,7 @@ S3(
 
 - **URI with Wildcards**
 
-  URI can use wildcards to read multiple files. Note: When using wildcards, 
ensure all files have the same format (especially `csv`, `csv_with_names`, 
`csv_with_names_and_types` are different formats). S3 TVF will use the first 
file to parse Table Schema.
+  URI can use wildcards and range patterns to read multiple files. For 
detailed syntax including `*`, `?`, `[...]`, and `{1..10}` range expansion, see 
[File Path Pattern](../../basic-element/file-path-pattern). Note: When using 
wildcards, ensure all files have the same format (especially `csv`, 
`csv_with_names`, `csv_with_names_and_types` are different formats). S3 TVF 
will use the first file to parse Table Schema.
   
   With the following two CSV files:
 
diff --git 
a/i18n/zh-CN/docusaurus-plugin-content-docs/current/data-operate/import/import-way/broker-load-manual.md
 
b/i18n/zh-CN/docusaurus-plugin-content-docs/current/data-operate/import/import-way/broker-load-manual.md
index 81c16f97372..7cf5c87a510 100644
--- 
a/i18n/zh-CN/docusaurus-plugin-content-docs/current/data-operate/import/import-way/broker-load-manual.md
+++ 
b/i18n/zh-CN/docusaurus-plugin-content-docs/current/data-operate/import/import-way/broker-load-manual.md
@@ -23,6 +23,12 @@ Broker Load 适合源数据存储在远程存储系统,比如对象存储或 H
 - HDFS 协议
 - 其他协议(需要相应的 Broker 进程)
 
+支持的文件路径模式:
+
+- 通配符:`*`、`?`、`[abc]`、`[a-z]`
+- 范围展开:`{1..10}`、`{a,b,c}`
+- 完整语法请参阅[文件路径模式](../../../sql-manual/basic-element/file-path-pattern)
+
 支持的数据类型:
 
 - CSV
@@ -551,6 +557,8 @@ Broker Name 只是一个用户自定义名称,不代表 Broker 的类型。
 
 ### 从 HDFS 导入数据,使用通配符匹配两批文件,分别导入到两个表中
 
+  Broker Load 
支持在文件路径中使用通配符(`*`、`?`、`[...]`)和范围模式(`{1..10}`)。详细语法请参阅[文件路径模式](../../../sql-manual/basic-element/file-path-pattern)。
+
   ```sql
   LOAD LABEL example_db.label2
   (
diff --git 
a/i18n/zh-CN/docusaurus-plugin-content-docs/current/data-operate/import/import-way/insert-into-manual.md
 
b/i18n/zh-CN/docusaurus-plugin-content-docs/current/data-operate/import/import-way/insert-into-manual.md
index 7edad457703..4d49f826755 100644
--- 
a/i18n/zh-CN/docusaurus-plugin-content-docs/current/data-operate/import/import-way/insert-into-manual.md
+++ 
b/i18n/zh-CN/docusaurus-plugin-content-docs/current/data-operate/import/import-way/insert-into-manual.md
@@ -308,7 +308,9 @@ INSERT 命令是同步命令,返回成功,即表示导入成功。
 
 ## 通过 TVF 导入数据
 
-通过 Table Value Function 功能,Doris 可以直接将对象存储或 HDFS 上的文件作为 Table 
进行查询分析、并且支持自动的列类型推断、多文件导入。详细介绍,请参考[湖仓一体/TVF文档](https://doris.apache.org/zh-CN/docs/3.0/lakehouse/file-analysis?_highlight=%E9%80%9A%E8%BF%87&_highlight=table&_highlight=value&_highlight=function&_highlight=%E5%8A%9F%E8%83%BD)。
+通过 Table Value Function 功能,Doris 可以直接将对象存储或 HDFS 上的文件作为 Table 
进行查询分析、并且支持自动的列类型推断、多文件导入。详细介绍,请参考[湖仓一体/TVF文档](../../../lakehouse/file-analysis)。
+
+TVF 
支持在文件路径中使用通配符(`*`、`?`、`[...]`)和范围模式(`{1..10}`)。完整语法请参阅[文件路径模式](../../../sql-manual/basic-element/file-path-pattern)。
 
 ### 自动推断文件列类型
 
diff --git 
a/i18n/zh-CN/docusaurus-plugin-content-docs/current/lakehouse/file-analysis.md 
b/i18n/zh-CN/docusaurus-plugin-content-docs/current/lakehouse/file-analysis.md
index 7b5cfc61eaf..900942cdb14 100644
--- 
a/i18n/zh-CN/docusaurus-plugin-content-docs/current/lakehouse/file-analysis.md
+++ 
b/i18n/zh-CN/docusaurus-plugin-content-docs/current/lakehouse/file-analysis.md
@@ -39,21 +39,15 @@ TVF 的属性包括要分析的文件路径,文件格式、对象存储的连
 
 ### 多文件导入
 
-在导入时,文件路径(URI)支持使用通配符进行匹配。Doris 
的文件路径匹配采用[Glob匹配模式](https://en.wikipedia.org/wiki/Glob_(programming)#:~:text=glob%20%28%29%20%28%2F%20%C9%A1l%C9%92b%20%2F%29%20is%20a%20libc,into%20a%20list%20of%20names%20matching%20that%20pattern.),并在此基础上进行了一些扩展,支持更灵活的文件选择方式。
+文件路径(URI)支持使用通配符和范围模式匹配多个文件:
 
-- `file_{1..3}`:匹配文件`file_1`、`file_2`、`file_3`
-- `file_{1,3}_{1,2}`:匹配文件`file_1_1`、`file_1_2`、`file_3_1`、`file_1_2` 
(支持和`{n..m}`方式混用,用逗号隔开)
-- `file_*`:匹配所有`file_`开头的文件
-- `*.parquet`:匹配所有`.parquet`后缀的文件
-- `tvf_test/*`:匹配`tvf_test`目录下的所有文件
-- `*test*`:匹配文件名中包含 `test`的文件
+| 模式 | 示例 | 匹配 |
+|------|------|------|
+| `*` | `file_*` | 所有以 `file_` 开头的文件 |
+| `{n..m}` | `file_{1..3}` | `file_1`、`file_2`、`file_3` |
+| `{a,b,c}` | `file_{a,b}` | `file_a`、`file_b` |
 
-**注意**
-
-- `{1..3}`的写法中顺序可以颠倒,`{3..1}`也是可以的。
-- 
`file_{-1..2}`、`file_{a..4}`这种写法不符合规定,不支持使用负数或者字母作为枚举端点,但是`file_{1..3,11}`是允许的,会匹配到文件`file_1`、`file_2`、`file_3`、`file_11`。
-- 
doris尽量让能够导入的文件导入成功,如果是`file_{a..b,-1..3,4..5}`这样包含了错误写法的路径,我们会匹配到文件`file_4`和`file_5`。
-- `{1..4,5}`采用逗号添加的只允许是数字,而不允许如`{1..4,a}`这样的写法,后者会忽略掉`{a}`
+完整语法包括所有支持的通配符、范围展开规则和使用示例,请参阅[文件路径模式](../sql-manual/basic-element/file-path-pattern)。
 
 
 ### 自动推断文件列类型
diff --git 
a/i18n/zh-CN/docusaurus-plugin-content-docs/current/sql-manual/basic-element/file-path-pattern.md
 
b/i18n/zh-CN/docusaurus-plugin-content-docs/current/sql-manual/basic-element/file-path-pattern.md
new file mode 100644
index 00000000000..060f49a00eb
--- /dev/null
+++ 
b/i18n/zh-CN/docusaurus-plugin-content-docs/current/sql-manual/basic-element/file-path-pattern.md
@@ -0,0 +1,309 @@
+---
+{
+    "title": "文件路径模式",
+    "language": "zh-CN",
+    "description": "Doris 访问远程存储系统(如 S3、HDFS 和其他对象存储)中文件时支持的文件路径模式和通配符。"
+}
+---
+
+## 描述
+
+当访问远程存储系统(S3、HDFS 和其他 S3 兼容的对象存储)中的文件时,Doris 
支持灵活的文件路径模式,包括通配符和范围表达式。本文档描述了支持的路径格式和模式匹配语法。
+
+以下功能支持这些路径模式:
+- [S3 TVF](../sql-functions/table-valued-functions/s3)
+- [HDFS TVF](../sql-functions/table-valued-functions/hdfs)
+- [Broker Load](../../data-operate/import/import-way/broker-load-manual)
+- INSERT INTO SELECT(从 TVF 导入)
+
+## 支持的 URI 格式
+
+### S3 风格 URI
+
+| 风格 | 格式 | 示例 |
+|------|------|------|
+| AWS Client 风格(Hadoop S3) | `s3://bucket/path/to/file` | 
`s3://my-bucket/data/file.csv` |
+| S3A 风格 | `s3a://bucket/path/to/file` | `s3a://my-bucket/data/file.csv` |
+| S3N 风格 | `s3n://bucket/path/to/file` | `s3n://my-bucket/data/file.csv` |
+| Virtual Host 风格 | `https://bucket.endpoint/path/to/file` | 
`https://my-bucket.s3.us-west-1.amazonaws.com/data/file.csv` |
+| Path 风格 | `https://endpoint/bucket/path/to/file` | 
`https://s3.us-west-1.amazonaws.com/my-bucket/data/file.csv` |
+
+### 其他云存储 URI
+
+| 云服务商 | 协议 | 示例 |
+|----------|------|------|
+| 阿里云 OSS | `oss://` | `oss://my-bucket/data/file.csv` |
+| 腾讯云 COS | `cos://`, `cosn://` | `cos://my-bucket/data/file.csv` |
+| 百度云 BOS | `bos://` | `bos://my-bucket/data/file.csv` |
+| 华为云 OBS | `obs://` | `obs://my-bucket/data/file.csv` |
+| Google Cloud Storage | `gs://` | `gs://my-bucket/data/file.csv` |
+| Azure Blob Storage | `azure://` | `azure://container/data/file.csv` |
+
+### HDFS URI
+
+| 风格 | 格式 | 示例 |
+|------|------|------|
+| 标准格式 | `hdfs://namenode:port/path/to/file` | 
`hdfs://namenode:8020/user/data/file.csv` |
+| HA 模式 | `hdfs://nameservice/path/to/file` | 
`hdfs://my-ha-cluster/user/data/file.csv` |
+
+## 通配符模式
+
+Doris 使用 glob 风格的模式匹配来匹配文件路径。支持以下通配符:
+
+### 基本通配符
+
+| 模式 | 说明 | 示例 | 匹配 |
+|------|------|------|------|
+| `*` | 匹配路径段内的零个或多个字符 | `*.csv` | `file.csv`、`data.csv`、`a.csv` |
+| `?` | 匹配恰好一个字符 | `file?.csv` | `file1.csv`、`fileA.csv`,但不匹配 `file10.csv` |
+| `[abc]` | 匹配括号内的任意单个字符 | `file[123].csv` | 
`file1.csv`、`file2.csv`、`file3.csv` |
+| `[a-z]` | 匹配范围内的任意单个字符 | `file[a-c].csv` | 
`filea.csv`、`fileb.csv`、`filec.csv` |
+| `[!abc]` | 匹配不在括号内的任意单个字符 | `file[!0-9].csv` | `filea.csv`、`fileb.csv`,但不匹配 
`file1.csv` |
+
+### 范围展开(花括号模式)
+
+Doris 支持使用花括号模式 `{start..end}` 进行数字范围展开:
+
+| 模式 | 展开结果 | 匹配 |
+|------|----------|------|
+| `{1..3}` | `{1,2,3}` | `1`、`2`、`3` |
+| `{01..05}` | `{1,2,3,4,5}` | `1`、`2`、`3`、`4`、`5`(前导零不会保留) |
+| `{3..1}` | `{1,2,3}` | `1`、`2`、`3`(支持逆序范围) |
+| `{a,b,c}` | `{a,b,c}` | `a`、`b`、`c`(枚举) |
+| `{1..3,5,7..9}` | `{1,2,3,5,7,8,9}` | 混合范围和值 |
+
+:::caution 注意
+- Doris 
尽量让能够导入的文件导入成功。花括号表达式中无效的部分会被静默跳过,有效部分仍会正常展开。例如,`file_{a..b,-1..3,4..5}` 会匹配到 
`file_4` 和 `file_5`(无效的 `a..b` 和负数范围 `-1..3` 被跳过,但 `4..5` 正常展开)。
+- 如果范围端点为负数(如 `{-1..2}`),该范围会被跳过。如果与有效范围混用(如 `{-1..2,1..3}`),只有有效范围 `1..3` 
会被展开。
+- 使用逗号与范围混用时,只允许添加数字。例如 `{1..4,a}` 中,非数字的 `a` 会被忽略,结果为 `{1,2,3,4}`。
+- 纯枚举模式如 `{a,b,c}`(不含 `..` 范围)会直接传递给 glob 匹配,可以正常工作。
+:::
+
+### 组合模式
+
+可以在单个路径中组合多个模式:
+
+```
+s3://bucket/data_{1..3}/file_*.csv
+```
+
+这将匹配:
+- `s3://bucket/data_1/file_a.csv`
+- `s3://bucket/data_1/file_b.csv`
+- `s3://bucket/data_2/file_a.csv`
+- 等等
+
+## 示例
+
+### S3 TVF 示例
+
+**匹配目录中的所有 CSV 文件:**
+
+```sql
+SELECT * FROM S3(
+    "uri" = "s3://my-bucket/data/*.csv",
+    "s3.access_key" = "xxx",
+    "s3.secret_key" = "xxx",
+    "s3.region" = "us-east-1",
+    "format" = "csv"
+);
+```
+
+**使用数字范围匹配文件:**
+
+```sql
+SELECT * FROM S3(
+    "uri" = "s3://my-bucket/logs/data_{1..10}.csv",
+    "s3.access_key" = "xxx",
+    "s3.secret_key" = "xxx",
+    "s3.region" = "us-east-1",
+    "format" = "csv"
+);
+```
+
+**匹配日期分区目录中的文件:**
+
+```sql
+SELECT * FROM S3(
+    "uri" = "s3://my-bucket/logs/year=2024/month=*/day=*/data.parquet",
+    "s3.access_key" = "xxx",
+    "s3.secret_key" = "xxx",
+    "s3.region" = "us-east-1",
+    "format" = "parquet"
+);
+```
+
+:::caution 零填充目录
+对于零填充的目录名如 `month=01`、`month=02`,请使用通配符(`*`)而不是范围模式。模式 `{01..12}` 会展开为 
`{1,2,...,12}`,无法匹配 `month=01`。
+:::
+
+**匹配编号的文件分片(如 Spark 输出):**
+
+由于范围展开不会保留前导零(见上文注意事项),对于 `part-00042.csv` 这类零填充的分片名,应使用通配符或字符类,而不是 `{00000..00099}` 范围:
+
+```sql
+SELECT * FROM S3(
+    "uri" = "s3://my-bucket/output/part-000[0-9][0-9].csv",
+    "s3.access_key" = "xxx",
+    "s3.secret_key" = "xxx",
+    "s3.region" = "us-east-1",
+    "format" = "csv"
+);
+```
+
+### Broker Load 示例
+
+**加载匹配模式的所有 CSV 文件:**
+
+```sql
+LOAD LABEL db.label_wildcard
+(
+    DATA INFILE("s3://my-bucket/data/file_*.csv")
+    INTO TABLE my_table
+    COLUMNS TERMINATED BY ","
+    FORMAT AS "CSV"
+    (col1, col2, col3)
+)
+WITH S3 (
+    "provider" = "S3",
+    "AWS_ENDPOINT" = "s3.us-west-2.amazonaws.com",
+    "AWS_ACCESS_KEY" = "xxx",
+    "AWS_SECRET_KEY" = "xxx",
+    "AWS_REGION" = "us-west-2"
+);
+```
+
+**使用数字范围展开加载文件:**
+
+```sql
+LOAD LABEL db.label_range
+(
+    DATA INFILE("s3://my-bucket/exports/data_{1..5}.csv")
+    INTO TABLE my_table
+    COLUMNS TERMINATED BY ","
+    FORMAT AS "CSV"
+    (col1, col2, col3)
+)
+WITH S3 (
+    "provider" = "S3",
+    "AWS_ENDPOINT" = "s3.us-west-2.amazonaws.com",
+    "AWS_ACCESS_KEY" = "xxx",
+    "AWS_SECRET_KEY" = "xxx",
+    "AWS_REGION" = "us-west-2"
+);
+```
+
+**从 HDFS 加载,使用通配符:**
+
+```sql
+LOAD LABEL db.label_hdfs_wildcard
+(
+    DATA INFILE("hdfs://namenode:8020/user/data/2024-*/*.csv")
+    INTO TABLE my_table
+    COLUMNS TERMINATED BY ","
+    FORMAT AS "CSV"
+    (col1, col2, col3)
+)
+WITH HDFS (
+    "fs.defaultFS" = "hdfs://namenode:8020",
+    "hadoop.username" = "user"
+);
+```
+
+**从 HDFS 加载,使用数字范围:**
+
+```sql
+LOAD LABEL db.label_hdfs_range
+(
+    DATA INFILE("hdfs://namenode:8020/data/file_{1..3,5,7..9}.csv")
+    INTO TABLE my_table
+    COLUMNS TERMINATED BY ","
+    FORMAT AS "CSV"
+    (col1, col2, col3)
+)
+WITH HDFS (
+    "fs.defaultFS" = "hdfs://namenode:8020",
+    "hadoop.username" = "user"
+);
+```
+
+### INSERT INTO SELECT 示例
+
+**使用通配符从 S3 插入数据:**
+
+```sql
+INSERT INTO my_table (col1, col2, col3)
+SELECT * FROM S3(
+    "uri" = "s3://my-bucket/data/part-*.parquet",
+    "s3.access_key" = "xxx",
+    "s3.secret_key" = "xxx",
+    "s3.region" = "us-east-1",
+    "format" = "parquet"
+);
+```
+
+## 性能优化建议
+
+### 使用具体的前缀
+
+Doris 会从路径模式中提取最长的非通配符前缀,以优化 S3/HDFS 的列表操作。更具体的前缀可以加快文件发现速度。
+
+```sql
+-- 推荐:具体的前缀减少列表范围
+"uri" = "s3://bucket/data/2024/01/15/*.csv"
+
+-- 不推荐:在早期路径段使用广泛的通配符
+"uri" = "s3://bucket/data/**/file.csv"
+```
+
+### 对已知序列优先使用范围模式
+
+当您知道确切的文件编号时,使用范围模式而不是通配符:
+
+```sql
+-- 更好:显式范围(注意:范围展开不保留前导零,此写法假定文件名未做零填充)
+"uri" = "s3://bucket/data/part-{1..100}.csv"
+
+-- 较差:通配符匹配未知文件
+"uri" = "s3://bucket/data/part-*.csv"
+```
+
+### 避免深层递归通配符
+
+深层递归模式如 `**` 可能导致大型存储桶上的文件列表速度变慢:
+
+```sql
+-- 尽量避免
+"uri" = "s3://bucket/**/*.csv"
+
+-- 优先使用显式路径结构
+"uri" = "s3://bucket/data/year=*/month=*/day=*/*.csv"
+```
+
+## 故障排查
+
+| 问题 | 原因 | 解决方案 |
+|------|------|----------|
+| 未找到文件 | 模式不匹配任何文件 | 验证路径和模式语法;先用单个文件测试 |
+| 文件列表缓慢 | 通配符范围太广或文件太多 | 使用更具体的前缀;限制通配符范围 |
+| URI 无效错误 | 路径语法格式错误 | 检查 URI 协议和存储桶名称格式 |
+| 访问被拒绝 | 凭证或权限问题 | 验证 S3/HDFS 凭证和存储桶策略 |
+
+### 测试路径模式
+
+在运行大型加载作业之前,先用有限的查询测试您的模式:
+
+```sql
+-- 测试文件是否存在并匹配模式
+SELECT * FROM S3(
+    "uri" = "s3://bucket/your/pattern/*.csv",
+    ...
+) LIMIT 1;
+```
+
+使用 `DESC FUNCTION` 验证匹配文件的 schema:
+
+```sql
+DESC FUNCTION S3(
+    "uri" = "s3://bucket/your/pattern/*.csv",
+    ...
+);
+```
diff --git 
a/i18n/zh-CN/docusaurus-plugin-content-docs/current/sql-manual/sql-functions/table-valued-functions/s3.md
 
b/i18n/zh-CN/docusaurus-plugin-content-docs/current/sql-manual/sql-functions/table-valued-functions/s3.md
index 1f562af25f1..87728b28441 100644
--- 
a/i18n/zh-CN/docusaurus-plugin-content-docs/current/sql-manual/sql-functions/table-valued-functions/s3.md
+++ 
b/i18n/zh-CN/docusaurus-plugin-content-docs/current/sql-manual/sql-functions/table-valued-functions/s3.md
@@ -31,7 +31,7 @@ S3(
 
 | 参数              | 描述                                                         
                                                                  |
 
|-------------------|--------------------------------------------------------------------------------------------------------------------------------|
-| `uri`               | 用于访问 S3 的 URI,该函数会根据 `use_path_style` 参数来决定使用路径样式(Path 
Style)还是虚拟托管样式(Virtual-hosted Style)进行访问 |
+| `uri`               | 用于访问 S3 的 
URI,支持通配符和范围模式。详见[文件路径模式](../../basic-element/file-path-pattern)。该函数会根据 
`use_path_style` 参数来决定使用路径样式(Path Style)还是虚拟托管样式(Virtual-hosted Style)进行访问 |
 | `s3.access_key`     | 访问 S3 的访问密钥                                            
                                                                |
 | `s3.secret_key`     | 访问 S3 的秘密密钥                                            
                                                                |
 | `s3.region`         | S3 存储所在的区域                                             
                                                                |
@@ -530,7 +530,7 @@ S3(
 
 - **URI 包含通配符**
 
-  URI 可以使用通配符来读取多个文件。注意:如果使用通配符,需要保证各个文件的格式一致(尤其是 
`csv`、`csv_with_names`、`csv_with_names_and_types` 属于不同的格式),S3 TVF 会使用第一个文件来解析 
Table Schema。
+  URI 可以使用通配符和范围模式来读取多个文件。详细语法包括 `*`、`?`、`[...]` 以及 `{1..10}` 
范围展开,请参阅[文件路径模式](../../basic-element/file-path-pattern)。注意:如果使用通配符,需要保证各个文件的格式一致(尤其是
 `csv`、`csv_with_names`、`csv_with_names_and_types` 属于不同的格式),S3 TVF 会使用第一个文件来解析 
Table Schema。
   
   如下两个 CSV 文件:
 
diff --git 
a/i18n/zh-CN/docusaurus-plugin-content-docs/version-4.x/data-operate/import/import-way/broker-load-manual.md
 
b/i18n/zh-CN/docusaurus-plugin-content-docs/version-4.x/data-operate/import/import-way/broker-load-manual.md
index 81c16f97372..7cf5c87a510 100644
--- 
a/i18n/zh-CN/docusaurus-plugin-content-docs/version-4.x/data-operate/import/import-way/broker-load-manual.md
+++ 
b/i18n/zh-CN/docusaurus-plugin-content-docs/version-4.x/data-operate/import/import-way/broker-load-manual.md
@@ -23,6 +23,12 @@ Broker Load 适合源数据存储在远程存储系统,比如对象存储或 H
 - HDFS 协议
 - 其他协议(需要相应的 Broker 进程)
 
+支持的文件路径模式:
+
+- 通配符:`*`、`?`、`[abc]`、`[a-z]`
+- 范围展开:`{1..10}`、`{a,b,c}`
+- 完整语法请参阅[文件路径模式](../../../sql-manual/basic-element/file-path-pattern)
+
 支持的数据类型:
 
 - CSV
@@ -551,6 +557,8 @@ Broker Name 只是一个用户自定义名称,不代表 Broker 的类型。
 
 ### 从 HDFS 导入数据,使用通配符匹配两批文件,分别导入到两个表中
 
+  Broker Load 
支持在文件路径中使用通配符(`*`、`?`、`[...]`)和范围模式(`{1..10}`)。详细语法请参阅[文件路径模式](../../../sql-manual/basic-element/file-path-pattern)。
+
   ```sql
   LOAD LABEL example_db.label2
   (
diff --git 
a/i18n/zh-CN/docusaurus-plugin-content-docs/version-4.x/data-operate/import/import-way/insert-into-manual.md
 
b/i18n/zh-CN/docusaurus-plugin-content-docs/version-4.x/data-operate/import/import-way/insert-into-manual.md
index 7edad457703..4d49f826755 100644
--- 
a/i18n/zh-CN/docusaurus-plugin-content-docs/version-4.x/data-operate/import/import-way/insert-into-manual.md
+++ 
b/i18n/zh-CN/docusaurus-plugin-content-docs/version-4.x/data-operate/import/import-way/insert-into-manual.md
@@ -308,7 +308,9 @@ INSERT 命令是同步命令,返回成功,即表示导入成功。
 
 ## 通过 TVF 导入数据
 
-通过 Table Value Function 功能,Doris 可以直接将对象存储或 HDFS 上的文件作为 Table 
进行查询分析、并且支持自动的列类型推断、多文件导入。详细介绍,请参考[湖仓一体/TVF文档](https://doris.apache.org/zh-CN/docs/3.0/lakehouse/file-analysis?_highlight=%E9%80%9A%E8%BF%87&_highlight=table&_highlight=value&_highlight=function&_highlight=%E5%8A%9F%E8%83%BD)。
+通过 Table Value Function 功能,Doris 可以直接将对象存储或 HDFS 上的文件作为 Table 
进行查询分析、并且支持自动的列类型推断、多文件导入。详细介绍,请参考[湖仓一体/TVF文档](../../../lakehouse/file-analysis)。
+
+TVF 
支持在文件路径中使用通配符(`*`、`?`、`[...]`)和范围模式(`{1..10}`)。完整语法请参阅[文件路径模式](../../../sql-manual/basic-element/file-path-pattern)。
 
 ### 自动推断文件列类型
 
diff --git 
a/i18n/zh-CN/docusaurus-plugin-content-docs/version-4.x/lakehouse/file-analysis.md
 
b/i18n/zh-CN/docusaurus-plugin-content-docs/version-4.x/lakehouse/file-analysis.md
index 7b5cfc61eaf..900942cdb14 100644
--- 
a/i18n/zh-CN/docusaurus-plugin-content-docs/version-4.x/lakehouse/file-analysis.md
+++ 
b/i18n/zh-CN/docusaurus-plugin-content-docs/version-4.x/lakehouse/file-analysis.md
@@ -39,21 +39,15 @@ TVF 的属性包括要分析的文件路径,文件格式、对象存储的连
 
 ### 多文件导入
 
-在导入时,文件路径(URI)支持使用通配符进行匹配。Doris 
的文件路径匹配采用[Glob匹配模式](https://en.wikipedia.org/wiki/Glob_(programming)#:~:text=glob%20%28%29%20%28%2F%20%C9%A1l%C9%92b%20%2F%29%20is%20a%20libc,into%20a%20list%20of%20names%20matching%20that%20pattern.),并在此基础上进行了一些扩展,支持更灵活的文件选择方式。
+文件路径(URI)支持使用通配符和范围模式匹配多个文件:
 
-- `file_{1..3}`:匹配文件`file_1`、`file_2`、`file_3`
-- `file_{1,3}_{1,2}`:匹配文件`file_1_1`、`file_1_2`、`file_3_1`、`file_1_2` 
(支持和`{n..m}`方式混用,用逗号隔开)
-- `file_*`:匹配所有`file_`开头的文件
-- `*.parquet`:匹配所有`.parquet`后缀的文件
-- `tvf_test/*`:匹配`tvf_test`目录下的所有文件
-- `*test*`:匹配文件名中包含 `test`的文件
+| 模式 | 示例 | 匹配 |
+|------|------|------|
+| `*` | `file_*` | 所有以 `file_` 开头的文件 |
+| `{n..m}` | `file_{1..3}` | `file_1`、`file_2`、`file_3` |
+| `{a,b,c}` | `file_{a,b}` | `file_a`、`file_b` |
 
-**注意**
-
-- `{1..3}`的写法中顺序可以颠倒,`{3..1}`也是可以的。
-- 
`file_{-1..2}`、`file_{a..4}`这种写法不符合规定,不支持使用负数或者字母作为枚举端点,但是`file_{1..3,11}`是允许的,会匹配到文件`file_1`、`file_2`、`file_3`、`file_11`。
-- 
doris尽量让能够导入的文件导入成功,如果是`file_{a..b,-1..3,4..5}`这样包含了错误写法的路径,我们会匹配到文件`file_4`和`file_5`。
-- `{1..4,5}`采用逗号添加的只允许是数字,而不允许如`{1..4,a}`这样的写法,后者会忽略掉`{a}`
+完整语法包括所有支持的通配符、范围展开规则和使用示例,请参阅[文件路径模式](../sql-manual/basic-element/file-path-pattern)。
 
 
 ### 自动推断文件列类型
diff --git 
a/i18n/zh-CN/docusaurus-plugin-content-docs/version-4.x/sql-manual/basic-element/file-path-pattern.md
 
b/i18n/zh-CN/docusaurus-plugin-content-docs/version-4.x/sql-manual/basic-element/file-path-pattern.md
new file mode 100644
index 00000000000..060f49a00eb
--- /dev/null
+++ 
b/i18n/zh-CN/docusaurus-plugin-content-docs/version-4.x/sql-manual/basic-element/file-path-pattern.md
@@ -0,0 +1,309 @@
+---
+{
+    "title": "文件路径模式",
+    "language": "zh-CN",
+    "description": "Doris 访问远程存储系统(如 S3、HDFS 和其他对象存储)中文件时支持的文件路径模式和通配符。"
+}
+---
+
+## 描述
+
+当访问远程存储系统(S3、HDFS 和其他 S3 兼容的对象存储)中的文件时,Doris 
支持灵活的文件路径模式,包括通配符和范围表达式。本文档描述了支持的路径格式和模式匹配语法。
+
+以下功能支持这些路径模式:
+- [S3 TVF](../sql-functions/table-valued-functions/s3)
+- [HDFS TVF](../sql-functions/table-valued-functions/hdfs)
+- [Broker Load](../../data-operate/import/import-way/broker-load-manual)
+- INSERT INTO SELECT(从 TVF 导入)
+
+## 支持的 URI 格式
+
+### S3 风格 URI
+
+| 风格 | 格式 | 示例 |
+|------|------|------|
+| AWS Client 风格(Hadoop S3) | `s3://bucket/path/to/file` | 
`s3://my-bucket/data/file.csv` |
+| S3A 风格 | `s3a://bucket/path/to/file` | `s3a://my-bucket/data/file.csv` |
+| S3N 风格 | `s3n://bucket/path/to/file` | `s3n://my-bucket/data/file.csv` |
+| Virtual Host 风格 | `https://bucket.endpoint/path/to/file` | 
`https://my-bucket.s3.us-west-1.amazonaws.com/data/file.csv` |
+| Path 风格 | `https://endpoint/bucket/path/to/file` | 
`https://s3.us-west-1.amazonaws.com/my-bucket/data/file.csv` |
+
+### 其他云存储 URI
+
+| 云服务商 | 协议 | 示例 |
+|----------|------|------|
+| 阿里云 OSS | `oss://` | `oss://my-bucket/data/file.csv` |
+| 腾讯云 COS | `cos://`, `cosn://` | `cos://my-bucket/data/file.csv` |
+| 百度云 BOS | `bos://` | `bos://my-bucket/data/file.csv` |
+| 华为云 OBS | `obs://` | `obs://my-bucket/data/file.csv` |
+| Google Cloud Storage | `gs://` | `gs://my-bucket/data/file.csv` |
+| Azure Blob Storage | `azure://` | `azure://container/data/file.csv` |
+
+### HDFS URI
+
+| 风格 | 格式 | 示例 |
+|------|------|------|
+| 标准格式 | `hdfs://namenode:port/path/to/file` | 
`hdfs://namenode:8020/user/data/file.csv` |
+| HA 模式 | `hdfs://nameservice/path/to/file` | 
`hdfs://my-ha-cluster/user/data/file.csv` |
+
+## 通配符模式
+
+Doris 使用 glob 风格的模式匹配来匹配文件路径。支持以下通配符:
+
+### 基本通配符
+
+| 模式 | 说明 | 示例 | 匹配 |
+|------|------|------|------|
+| `*` | 匹配路径段内的零个或多个字符 | `*.csv` | `file.csv`、`data.csv`、`a.csv` |
+| `?` | 匹配恰好一个字符 | `file?.csv` | `file1.csv`、`fileA.csv`,但不匹配 `file10.csv` |
+| `[abc]` | 匹配括号内的任意单个字符 | `file[123].csv` | 
`file1.csv`、`file2.csv`、`file3.csv` |
+| `[a-z]` | 匹配范围内的任意单个字符 | `file[a-c].csv` | 
`filea.csv`、`fileb.csv`、`filec.csv` |
+| `[!abc]` | 匹配不在括号内的任意单个字符 | `file[!0-9].csv` | `filea.csv`、`fileb.csv`,但不匹配 
`file1.csv` |
+
+### 范围展开(花括号模式)
+
+Doris 支持使用花括号模式 `{start..end}` 进行数字范围展开:
+
+| 模式 | 展开结果 | 匹配 |
+|------|----------|------|
+| `{1..3}` | `{1,2,3}` | `1`、`2`、`3` |
+| `{01..05}` | `{1,2,3,4,5}` | `1`、`2`、`3`、`4`、`5`(前导零不会保留) |
+| `{3..1}` | `{1,2,3}` | `1`、`2`、`3`(支持逆序范围) |
+| `{a,b,c}` | `{a,b,c}` | `a`、`b`、`c`(枚举) |
+| `{1..3,5,7..9}` | `{1,2,3,5,7,8,9}` | 混合范围和值 |
+
+:::caution 注意
+- Doris 
尽量让能够导入的文件导入成功。花括号表达式中无效的部分会被静默跳过,有效部分仍会正常展开。例如,`file_{a..b,-1..3,4..5}` 会匹配到 
`file_4` 和 `file_5`(无效的 `a..b` 和负数范围 `-1..3` 被跳过,但 `4..5` 正常展开)。
+- 如果范围端点为负数(如 `{-1..2}`),该范围会被跳过。如果与有效范围混用(如 `{-1..2,1..3}`),只有有效范围 `1..3` 
会被展开。
+- 使用逗号与范围混用时,只允许添加数字。例如 `{1..4,a}` 中,非数字的 `a` 会被忽略,结果为 `{1,2,3,4}`。
+- 纯枚举模式如 `{a,b,c}`(不含 `..` 范围)会直接传递给 glob 匹配,可以正常工作。
+:::
+
+### 组合模式
+
+可以在单个路径中组合多个模式:
+
+```
+s3://bucket/data_{1..3}/file_*.csv
+```
+
+这将匹配:
+- `s3://bucket/data_1/file_a.csv`
+- `s3://bucket/data_1/file_b.csv`
+- `s3://bucket/data_2/file_a.csv`
+- 等等
+
+## 示例
+
+### S3 TVF 示例
+
+**匹配目录中的所有 CSV 文件:**
+
+```sql
+SELECT * FROM S3(
+    "uri" = "s3://my-bucket/data/*.csv",
+    "s3.access_key" = "xxx",
+    "s3.secret_key" = "xxx",
+    "s3.region" = "us-east-1",
+    "format" = "csv"
+);
+```
+
+**使用数字范围匹配文件:**
+
+```sql
+SELECT * FROM S3(
+    "uri" = "s3://my-bucket/logs/data_{1..10}.csv",
+    "s3.access_key" = "xxx",
+    "s3.secret_key" = "xxx",
+    "s3.region" = "us-east-1",
+    "format" = "csv"
+);
+```
+
+**匹配日期分区目录中的文件:**
+
+```sql
+SELECT * FROM S3(
+    "uri" = "s3://my-bucket/logs/year=2024/month=*/day=*/data.parquet",
+    "s3.access_key" = "xxx",
+    "s3.secret_key" = "xxx",
+    "s3.region" = "us-east-1",
+    "format" = "parquet"
+);
+```
+
+:::caution 零填充目录
+对于零填充的目录名如 `month=01`、`month=02`,请使用通配符(`*`)而不是范围模式。模式 `{01..12}` 会展开为 
`{1,2,...,12}`,无法匹配 `month=01`。
+:::
+
+**匹配编号的文件分片(如 Spark 输出):**
+
+由于范围展开不会保留前导零(见上文注意事项),对于 `part-00042.csv` 这类零填充的分片名,应使用通配符或字符类,而不是 `{00000..00099}` 范围:
+
+```sql
+SELECT * FROM S3(
+    "uri" = "s3://my-bucket/output/part-000[0-9][0-9].csv",
+    "s3.access_key" = "xxx",
+    "s3.secret_key" = "xxx",
+    "s3.region" = "us-east-1",
+    "format" = "csv"
+);
+```
+
+### Broker Load 示例
+
+**加载匹配模式的所有 CSV 文件:**
+
+```sql
+LOAD LABEL db.label_wildcard
+(
+    DATA INFILE("s3://my-bucket/data/file_*.csv")
+    INTO TABLE my_table
+    COLUMNS TERMINATED BY ","
+    FORMAT AS "CSV"
+    (col1, col2, col3)
+)
+WITH S3 (
+    "provider" = "S3",
+    "AWS_ENDPOINT" = "s3.us-west-2.amazonaws.com",
+    "AWS_ACCESS_KEY" = "xxx",
+    "AWS_SECRET_KEY" = "xxx",
+    "AWS_REGION" = "us-west-2"
+);
+```
+
+**使用数字范围展开加载文件:**
+
+```sql
+LOAD LABEL db.label_range
+(
+    DATA INFILE("s3://my-bucket/exports/data_{1..5}.csv")
+    INTO TABLE my_table
+    COLUMNS TERMINATED BY ","
+    FORMAT AS "CSV"
+    (col1, col2, col3)
+)
+WITH S3 (
+    "provider" = "S3",
+    "AWS_ENDPOINT" = "s3.us-west-2.amazonaws.com",
+    "AWS_ACCESS_KEY" = "xxx",
+    "AWS_SECRET_KEY" = "xxx",
+    "AWS_REGION" = "us-west-2"
+);
+```
+
+**从 HDFS 加载,使用通配符:**
+
+```sql
+LOAD LABEL db.label_hdfs_wildcard
+(
+    DATA INFILE("hdfs://namenode:8020/user/data/2024-*/*.csv")
+    INTO TABLE my_table
+    COLUMNS TERMINATED BY ","
+    FORMAT AS "CSV"
+    (col1, col2, col3)
+)
+WITH HDFS (
+    "fs.defaultFS" = "hdfs://namenode:8020",
+    "hadoop.username" = "user"
+);
+```
+
+**从 HDFS 加载,使用数字范围:**
+
+```sql
+LOAD LABEL db.label_hdfs_range
+(
+    DATA INFILE("hdfs://namenode:8020/data/file_{1..3,5,7..9}.csv")
+    INTO TABLE my_table
+    COLUMNS TERMINATED BY ","
+    FORMAT AS "CSV"
+    (col1, col2, col3)
+)
+WITH HDFS (
+    "fs.defaultFS" = "hdfs://namenode:8020",
+    "hadoop.username" = "user"
+);
+```
+
+### INSERT INTO SELECT 示例
+
+**使用通配符从 S3 插入数据:**
+
+```sql
+INSERT INTO my_table (col1, col2, col3)
+SELECT * FROM S3(
+    "uri" = "s3://my-bucket/data/part-*.parquet",
+    "s3.access_key" = "xxx",
+    "s3.secret_key" = "xxx",
+    "s3.region" = "us-east-1",
+    "format" = "parquet"
+);
+```
+
+## 性能优化建议
+
+### 使用具体的前缀
+
+Doris 会从路径模式中提取最长的非通配符前缀,以优化 S3/HDFS 的列表操作。更具体的前缀可以加快文件发现速度。
+
+```sql
+-- 推荐:具体的前缀减少列表范围
+"uri" = "s3://bucket/data/2024/01/15/*.csv"
+
+-- 不推荐:在早期路径段使用广泛的通配符
+"uri" = "s3://bucket/data/**/file.csv"
+```
+
+### 对已知序列优先使用范围模式
+
+当您知道确切的文件编号时,使用范围模式而不是通配符:
+
+```sql
+-- 更好:显式范围(注意:范围展开不保留前导零,此写法假定文件名未做零填充)
+"uri" = "s3://bucket/data/part-{1..100}.csv"
+
+-- 较差:通配符匹配未知文件
+"uri" = "s3://bucket/data/part-*.csv"
+```
+
+### 避免深层递归通配符
+
+深层递归模式如 `**` 可能导致大型存储桶上的文件列表速度变慢:
+
+```sql
+-- 尽量避免
+"uri" = "s3://bucket/**/*.csv"
+
+-- 优先使用显式路径结构
+"uri" = "s3://bucket/data/year=*/month=*/day=*/*.csv"
+```
+
+## 故障排查
+
+| 问题 | 原因 | 解决方案 |
+|------|------|----------|
+| 未找到文件 | 模式不匹配任何文件 | 验证路径和模式语法;先用单个文件测试 |
+| 文件列表缓慢 | 通配符范围太广或文件太多 | 使用更具体的前缀;限制通配符范围 |
+| URI 无效错误 | 路径语法格式错误 | 检查 URI 协议和存储桶名称格式 |
+| 访问被拒绝 | 凭证或权限问题 | 验证 S3/HDFS 凭证和存储桶策略 |
+
+### 测试路径模式
+
+在运行大型加载作业之前,先用有限的查询测试您的模式:
+
+```sql
+-- 测试文件是否存在并匹配模式
+SELECT * FROM S3(
+    "uri" = "s3://bucket/your/pattern/*.csv",
+    ...
+) LIMIT 1;
+```
+
+使用 `DESC FUNCTION` 验证匹配文件的 schema:
+
+```sql
+DESC FUNCTION S3(
+    "uri" = "s3://bucket/your/pattern/*.csv",
+    ...
+);
+```
diff --git 
a/i18n/zh-CN/docusaurus-plugin-content-docs/version-4.x/sql-manual/sql-functions/table-valued-functions/s3.md
 
b/i18n/zh-CN/docusaurus-plugin-content-docs/version-4.x/sql-manual/sql-functions/table-valued-functions/s3.md
index 866f4cd01ec..87728b28441 100644
--- 
a/i18n/zh-CN/docusaurus-plugin-content-docs/version-4.x/sql-manual/sql-functions/table-valued-functions/s3.md
+++ 
b/i18n/zh-CN/docusaurus-plugin-content-docs/version-4.x/sql-manual/sql-functions/table-valued-functions/s3.md
@@ -31,7 +31,7 @@ S3(
 
 | 参数              | 描述                                                         
                                                                  |
 
|-------------------|--------------------------------------------------------------------------------------------------------------------------------|
-| `uri`               | 用于访问 S3 的 URI,该函数会根据 `use_path_style` 参数来决定使用路径样式(Path 
Style)还是虚拟托管样式(Virtual-hosted Style)进行访问 |
+| `uri`               | 用于访问 S3 的 
URI,支持通配符和范围模式。详见[文件路径模式](../../basic-element/file-path-pattern)。该函数会根据 
`use_path_style` 参数来决定使用路径样式(Path Style)还是虚拟托管样式(Virtual-hosted Style)进行访问 |
 | `s3.access_key`     | 访问 S3 的访问密钥                                            
                                                                |
 | `s3.secret_key`     | 访问 S3 的秘密密钥                                            
                                                                |
 | `s3.region`         | S3 存储所在的区域                                             
                                                                |
@@ -62,7 +62,6 @@ S3(
 |`enable_mapping_varbinary`| `false` | 在读取 PARQUET/ORC 时将 BYTE_ARRAY 类型映射为 
STRING,开启后则映射为 VARBINARY 类型。| 4.0.3 版本开始支持 |
 |`enable_mapping_timestamp_tz`|默认为 false,在读取 PARQUET(TIMESTAMP WITH 
isAdjustedToUTC) \ ORC(TIMESTAMP_INSTANT) 类型映射为 DATATIME,开启后则会映射到 TIMESTAMPTZ 
类型。| 在 4.0.3 之后开始支持 |
 
-
 ## 注意事项
 
 1. 对于 AWS S3,标准 URI 样式有以下几种:
@@ -531,7 +530,7 @@ S3(
 
 - **URI 包含通配符**
 
-  URI 可以使用通配符来读取多个文件。注意:如果使用通配符,需要保证各个文件的格式一致(尤其是 
`csv`、`csv_with_names`、`csv_with_names_and_types` 属于不同的格式),S3 TVF 会使用第一个文件来解析 
Table Schema。
+  URI 可以使用通配符和范围模式来读取多个文件。详细语法包括 `*`、`?`、`[...]` 以及 `{1..10}` 
范围展开,请参阅[文件路径模式](../../basic-element/file-path-pattern)。注意:如果使用通配符,需要保证各个文件的格式一致(尤其是
 `csv`、`csv_with_names`、`csv_with_names_and_types` 属于不同的格式),S3 TVF 会使用第一个文件来解析 
Table Schema。
   
   如下两个 CSV 文件:
 
diff --git a/sidebars.ts b/sidebars.ts
index 15e7ed3e1ce..cfce6a2f6bf 100644
--- a/sidebars.ts
+++ b/sidebars.ts
@@ -1169,6 +1169,7 @@ const sidebars: SidebarsConfig = {
                         'sql-manual/basic-element/reserved-keywords',
                         'sql-manual/basic-element/variables',
                         'sql-manual/basic-element/comments',
+                        'sql-manual/basic-element/file-path-pattern',
                         {
                             type: 'category',
                             label: 'Operators',
diff --git 
a/versioned_docs/version-4.x/data-operate/import/import-way/broker-load-manual.md
 
b/versioned_docs/version-4.x/data-operate/import/import-way/broker-load-manual.md
index f292219832e..89bbba9fe15 100644
--- 
a/versioned_docs/version-4.x/data-operate/import/import-way/broker-load-manual.md
+++ 
b/versioned_docs/version-4.x/data-operate/import/import-way/broker-load-manual.md
@@ -24,6 +24,12 @@ Supported data sources:
 - HDFS protocol
 - Custom protocol (require broker process)
 
+Supported file path patterns:
+
+- Wildcards: `*`, `?`, `[abc]`, `[a-z]`
+- Range expansion: `{1..10}`, `{a,b,c}`
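+- Example: `file_{1..3}.csv` matches `file_1.csv`, `file_2.csv`, and `file_3.csv`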
+- See [File Path Pattern](../../../sql-manual/basic-element/file-path-pattern) 
for complete syntax
+
 Supported data types:
 
 - CSV
@@ -558,6 +564,8 @@ Different Broker types and access methods require different 
authentication infor
 
 ### Importing data from HDFS using wildcards to match two batches of files and 
importing them into two separate tables
 
+  Broker Load supports wildcards (`*`, `?`, `[...]`) and range patterns 
(`{1..10}`) in file paths. For detailed syntax, see [File Path 
Pattern](../../../sql-manual/basic-element/file-path-pattern).
+
   ```sql
   LOAD LABEL example_db.label2
   (
diff --git 
a/versioned_docs/version-4.x/data-operate/import/import-way/insert-into-manual.md
 
b/versioned_docs/version-4.x/data-operate/import/import-way/insert-into-manual.md
index ae76125ec16..9a287c7b303 100644
--- 
a/versioned_docs/version-4.x/data-operate/import/import-way/insert-into-manual.md
+++ 
b/versioned_docs/version-4.x/data-operate/import/import-way/insert-into-manual.md
@@ -312,7 +312,9 @@ The INSERT command is a synchronous command. If it returns 
a result, that indica
 
 ## Ingest data by TVF
 
-Doris can directly query and analyze files stored in object storage or HDFS as 
tables through the Table Value Functions (TVFs), which supports automatic 
column type inference. For detailed information, please refer to the 
[Lakehouse/TVF 
documentation](https://doris.apache.org/docs/3.0/lakehouse/file-analysis).
+Doris can directly query and analyze files stored in object storage or HDFS as 
tables through Table Valued Functions (TVFs), which support automatic 
column type inference. For detailed information, please refer to the 
[Lakehouse/TVF documentation](../../../lakehouse/file-analysis).
+
+TVF supports wildcards (`*`, `?`, `[...]`) and range patterns (`{1..10}`) in 
file paths. For complete syntax, see [File Path 
Pattern](../../../sql-manual/basic-element/file-path-pattern).
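+
+For example, a single TVF query can ingest a whole batch of numbered files at 
once (bucket, table, and file names here are illustrative):
+
+```sql
+INSERT INTO my_table
+SELECT * FROM S3(
+    "uri" = "s3://my-bucket/data/file_{1..10}.csv",
+    "s3.access_key" = "xxx",
+    "s3.secret_key" = "xxx",
+    "s3.region" = "us-east-1",
+    "format" = "csv"
+);
+```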
 
 ### Automatic column type inference
 
diff --git a/versioned_docs/version-4.x/lakehouse/file-analysis.md 
b/versioned_docs/version-4.x/lakehouse/file-analysis.md
index c4b1d614c05..d739a51f71f 100644
--- a/versioned_docs/version-4.x/lakehouse/file-analysis.md
+++ b/versioned_docs/version-4.x/lakehouse/file-analysis.md
@@ -39,21 +39,15 @@ The attributes of a TVF include the file path to be 
analyzed, file format, conne
 
 ### Multiple File Import
 
-When importing, the file path (URI) supports wildcards for matching. Doris 
file path matching uses the [Glob matching 
pattern](https://en.wikipedia.org/wiki/Glob_(programming)#:~:text=glob%20%28%29%20%28%2F%20%C9%A1l%C9%92b%20%2F%29%20is%20a%20libc,into%20a%20list%20of%20names%20matching%20that%20pattern.),
 and has been extended on this basis to support more flexible file selection 
methods.
+The file path (URI) supports wildcards and range patterns for matching 
multiple files:
 
-- `file_{1..3}`: Matches files `file_1`, `file_2`, `file_3`
-- `file_{1,3}_{1,2}`: Matches files `file_1_1`, `file_1_2`, `file_3_1`, 
`file_3_2` (supports mixing with `{n..m}` notation, separated by commas)
-- `file_*`: Matches all files starting with `file_`
-- `*.parquet`: Matches all files with the `.parquet` suffix
-- `tvf_test/*`: Matches all files in the `tvf_test` directory
-- `*test*`: Matches files containing `test` in the filename
+| Pattern | Example | Matches |
+|---------|---------|---------|
+| `*` | `file_*` | All files starting with `file_` |
+| `{n..m}` | `file_{1..3}` | `file_1`, `file_2`, `file_3` |
+| `{a,b,c}` | `file_{a,b}` | `file_a`, `file_b` |
 
-**Notes**
-
-- In the `{1..3}` notation, the order can be reversed, `{3..1}` is also valid.
-- Notations like `file_{-1..2}` and `file_{a..4}` are not supported, as 
negative numbers or letters cannot be used as enumeration endpoints. However, 
`file_{1..3,11,a}` is allowed and will match files `file_1`, `file_2`, 
`file_3`, `file_11`, and `file_a`.
-- Doris tries to import as many files as possible. For paths like 
`file_{a..b,-1..3,4..5}` that contain incorrect notation, we will match files 
`file_4` and `file_5`.
-- When using commas with `{1..4,5}`, only numbers are allowed. Expressions 
like `{1..4,a}` are not supported; in this case, `{a}` will be ignored.
+For complete syntax including all supported wildcards, range expansion rules, 
and usage examples, see [File Path 
Pattern](../sql-manual/basic-element/file-path-pattern).
 
 
 ### Automatic Inference of File Column Types
diff --git 
a/versioned_docs/version-4.x/sql-manual/basic-element/file-path-pattern.md 
b/versioned_docs/version-4.x/sql-manual/basic-element/file-path-pattern.md
new file mode 100644
index 00000000000..77bcc96e99a
--- /dev/null
+++ b/versioned_docs/version-4.x/sql-manual/basic-element/file-path-pattern.md
@@ -0,0 +1,309 @@
+---
+{
+    "title": "File Path Pattern",
+    "language": "en",
+    "description": "File path patterns and wildcards supported by Doris for 
accessing files in remote storage systems like S3, HDFS, and other object 
storage."
+}
+---
+
+## Description
+
+When accessing files from remote storage systems (S3, HDFS, and other 
S3-compatible object storage), Doris supports flexible file path patterns 
including wildcards and range expressions. This document describes the 
supported path formats and pattern matching syntax.
+
+These path patterns are supported by:
+- [S3 TVF](../sql-functions/table-valued-functions/s3)
+- [HDFS TVF](../sql-functions/table-valued-functions/hdfs)
+- [Broker Load](../../data-operate/import/import-way/broker-load-manual)
+- INSERT INTO SELECT from TVF
+
+## Supported URI Formats
+
+### S3-Style URIs
+
+| Style | Format | Example |
+|-------|--------|---------|
+| AWS Client Style (Hadoop S3) | `s3://bucket/path/to/file` | 
`s3://my-bucket/data/file.csv` |
+| S3A Style | `s3a://bucket/path/to/file` | `s3a://my-bucket/data/file.csv` |
+| S3N Style | `s3n://bucket/path/to/file` | `s3n://my-bucket/data/file.csv` |
+| Virtual Host Style | `https://bucket.endpoint/path/to/file` | 
`https://my-bucket.s3.us-west-1.amazonaws.com/data/file.csv` |
+| Path Style | `https://endpoint/bucket/path/to/file` | 
`https://s3.us-west-1.amazonaws.com/my-bucket/data/file.csv` |
+
+### Other Cloud Storage URIs
+
+| Provider | Scheme | Example |
+|----------|--------|---------|
+| Alibaba Cloud OSS | `oss://` | `oss://my-bucket/data/file.csv` |
+| Tencent Cloud COS | `cos://`, `cosn://` | `cos://my-bucket/data/file.csv` |
+| Baidu Cloud BOS | `bos://` | `bos://my-bucket/data/file.csv` |
+| Huawei Cloud OBS | `obs://` | `obs://my-bucket/data/file.csv` |
+| Google Cloud Storage | `gs://` | `gs://my-bucket/data/file.csv` |
+| Azure Blob Storage | `azure://` | `azure://container/data/file.csv` |
+
+### HDFS URIs
+
+| Style | Format | Example |
+|-------|--------|---------|
+| Standard | `hdfs://namenode:port/path/to/file` | 
`hdfs://namenode:8020/user/data/file.csv` |
+| HA Mode | `hdfs://nameservice/path/to/file` | 
`hdfs://my-ha-cluster/user/data/file.csv` |
+
+## Wildcard Patterns
+
+Doris uses glob-style pattern matching for file paths. The following wildcards 
are supported:
+
+### Basic Wildcards
+
+| Pattern | Description | Example | Matches |
+|---------|-------------|---------|---------|
+| `*` | Matches zero or more characters within a path segment | `*.csv` | `file.csv`, `data.csv`, `a.csv` |
+| `?` | Matches exactly one character | `file?.csv` | `file1.csv`, `fileA.csv`, but not `file10.csv` |
+| `[abc]` | Matches any single character in brackets | `file[123].csv` | `file1.csv`, `file2.csv`, `file3.csv` |
+| `[a-z]` | Matches any single character in the range | `file[a-c].csv` | `filea.csv`, `fileb.csv`, `filec.csv` |
+| `[!abc]` | Matches any single character NOT in brackets | `file[!0-9].csv` | `filea.csv`, `fileb.csv`, but not `file1.csv` |
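+
+Bracket classes compose with the other wildcards inside a single path segment. A small sketch against hypothetical file names (bucket and credentials are placeholders):
+
+```sql
+-- Matches shard_a.csv, shard_b.csv, and shard_c.csv, but not shard_1.csv.
+SELECT * FROM S3(
+    "uri" = "s3://my-bucket/data/shard_[a-c].csv",
+    "s3.access_key" = "xxx",
+    "s3.secret_key" = "xxx",
+    "s3.region" = "us-east-1",
+    "format" = "csv"
+);
+```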
+
+### Range Expansion (Brace Patterns)
+
+Doris supports numeric range expansion using brace patterns `{start..end}`:
+
+| Pattern | Expansion | Matches |
+|---------|-----------|---------|
+| `{1..3}` | `{1,2,3}` | `1`, `2`, `3` |
+| `{01..05}` | `{1,2,3,4,5}` | `1`, `2`, `3`, `4`, `5` (leading zeros are NOT preserved) |
+| `{3..1}` | `{1,2,3}` | `1`, `2`, `3` (reverse ranges supported) |
+| `{a,b,c}` | `{a,b,c}` | `a`, `b`, `c` (enumeration) |
+| `{1..3,5,7..9}` | `{1,2,3,5,7,8,9}` | Mixed ranges and values |
+
+:::caution Note
+- Doris tries to match as many files as possible. Invalid parts in brace expressions are silently skipped, and valid parts are still expanded. For example, `file_{a..b,-1..3,4..5}` will match `file_4` and `file_5` (the invalid `a..b` and the negative range `-1..3` are skipped, while `4..5` is expanded normally).
+- If a range has a negative endpoint (e.g., `{-1..2}`), that range is skipped. If it is mixed with valid ranges (e.g., `{-1..2,1..3}`), only the valid range `1..3` is expanded.
+- When combining comma-separated values with ranges, only numbers are allowed. For example, in `{1..4,a}`, the non-numeric `a` is ignored, resulting in `{1,2,3,4}`.
+- Pure enumeration patterns like `{a,b,c}` (without `..` ranges) are passed directly to glob matching and work as expected.
+:::
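+
+To make the skipping behavior concrete, the following sketch (placeholder bucket and credentials) uses the mixed pattern from the first note; per the rules above, only `file_4.csv` and `file_5.csv` would be read:
+
+```sql
+-- `a..b` and `-1..3` are skipped as invalid; only `4..5` is expanded.
+SELECT * FROM S3(
+    "uri" = "s3://my-bucket/data/file_{a..b,-1..3,4..5}.csv",
+    "s3.access_key" = "xxx",
+    "s3.secret_key" = "xxx",
+    "s3.region" = "us-east-1",
+    "format" = "csv"
+);
+```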
+
+### Combining Patterns
+
+Multiple patterns can be combined in a single path:
+
+```
+s3://bucket/data_{1..3}/file_*.csv
+```
+
+This matches:
+- `s3://bucket/data_1/file_a.csv`
+- `s3://bucket/data_1/file_b.csv`
+- `s3://bucket/data_2/file_a.csv`
+- ... and so on
+
+## Examples
+
+### S3 TVF Examples
+
+**Match all CSV files in a directory:**
+
+```sql
+SELECT * FROM S3(
+    "uri" = "s3://my-bucket/data/*.csv",
+    "s3.access_key" = "xxx",
+    "s3.secret_key" = "xxx",
+    "s3.region" = "us-east-1",
+    "format" = "csv"
+);
+```
+
+**Match files with numeric range:**
+
+```sql
+SELECT * FROM S3(
+    "uri" = "s3://my-bucket/logs/data_{1..10}.csv",
+    "s3.access_key" = "xxx",
+    "s3.secret_key" = "xxx",
+    "s3.region" = "us-east-1",
+    "format" = "csv"
+);
+```
+
+**Match files in date-partitioned directories:**
+
+```sql
+SELECT * FROM S3(
+    "uri" = "s3://my-bucket/logs/year=2024/month=*/day=*/data.parquet",
+    "s3.access_key" = "xxx",
+    "s3.secret_key" = "xxx",
+    "s3.region" = "us-east-1",
+    "format" = "parquet"
+);
+```
+
+:::caution Zero-Padded Directories
+For zero-padded directory names like `month=01`, `month=02`, use wildcards (`*`) instead of range patterns: the pattern `{01..12}` expands to `{1,2,...,12}`, which will not match `month=01`. A possible workaround is sketched below.
+:::
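+
+One possible workaround, assuming pure enumerations keep their literal text as described in the range-expansion notes above, is to enumerate the zero-padded values explicitly (bucket and credentials are placeholders):
+
+```sql
+-- Pure enumeration (no `..`), so `01`, `02`, `03` should be passed to glob as-is.
+SELECT * FROM S3(
+    "uri" = "s3://my-bucket/logs/year=2024/month={01,02,03}/day=*/data.parquet",
+    "s3.access_key" = "xxx",
+    "s3.secret_key" = "xxx",
+    "s3.region" = "us-east-1",
+    "format" = "parquet"
+);
+```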
+
+**Match numbered file splits (e.g., Spark output):**
+
+Since range expansion does not preserve leading zeros, zero-padded split names such as `part-00000.csv` are better matched with a wildcard:
+
+```sql
+SELECT * FROM S3(
+    "uri" = "s3://my-bucket/output/part-*.csv",
+    "s3.access_key" = "xxx",
+    "s3.secret_key" = "xxx",
+    "s3.region" = "us-east-1",
+    "format" = "csv"
+);
+```
+
+### Broker Load Examples
+
+**Load all CSV files matching a pattern:**
+
+```sql
+LOAD LABEL db.label_wildcard
+(
+    DATA INFILE("s3://my-bucket/data/file_*.csv")
+    INTO TABLE my_table
+    COLUMNS TERMINATED BY ","
+    FORMAT AS "CSV"
+    (col1, col2, col3)
+)
+WITH S3 (
+    "provider" = "S3",
+    "AWS_ENDPOINT" = "s3.us-west-2.amazonaws.com",
+    "AWS_ACCESS_KEY" = "xxx",
+    "AWS_SECRET_KEY" = "xxx",
+    "AWS_REGION" = "us-west-2"
+);
+```
+
+**Load files using numeric range expansion:**
+
+```sql
+LOAD LABEL db.label_range
+(
+    DATA INFILE("s3://my-bucket/exports/data_{1..5}.csv")
+    INTO TABLE my_table
+    COLUMNS TERMINATED BY ","
+    FORMAT AS "CSV"
+    (col1, col2, col3)
+)
+WITH S3 (
+    "provider" = "S3",
+    "AWS_ENDPOINT" = "s3.us-west-2.amazonaws.com",
+    "AWS_ACCESS_KEY" = "xxx",
+    "AWS_SECRET_KEY" = "xxx",
+    "AWS_REGION" = "us-west-2"
+);
+```
+
+**Load from HDFS with wildcards:**
+
+```sql
+LOAD LABEL db.label_hdfs_wildcard
+(
+    DATA INFILE("hdfs://namenode:8020/user/data/2024-*/*.csv")
+    INTO TABLE my_table
+    COLUMNS TERMINATED BY ","
+    FORMAT AS "CSV"
+    (col1, col2, col3)
+)
+WITH HDFS (
+    "fs.defaultFS" = "hdfs://namenode:8020",
+    "hadoop.username" = "user"
+);
+```
+
+**Load from HDFS with numeric range:**
+
+```sql
+LOAD LABEL db.label_hdfs_range
+(
+    DATA INFILE("hdfs://namenode:8020/data/file_{1..3,5,7..9}.csv")
+    INTO TABLE my_table
+    COLUMNS TERMINATED BY ","
+    FORMAT AS "CSV"
+    (col1, col2, col3)
+)
+WITH HDFS (
+    "fs.defaultFS" = "hdfs://namenode:8020",
+    "hadoop.username" = "user"
+);
+```
+
+### INSERT INTO SELECT Examples
+
+**Insert from S3 with wildcards:**
+
+```sql
+INSERT INTO my_table (col1, col2, col3)
+SELECT * FROM S3(
+    "uri" = "s3://my-bucket/data/part-*.parquet",
+    "s3.access_key" = "xxx",
+    "s3.secret_key" = "xxx",
+    "s3.region" = "us-east-1",
+    "format" = "parquet"
+);
+```
+
+## Performance Considerations
+
+### Use Specific Prefixes
+
+Doris extracts the longest non-wildcard prefix from your path pattern to 
optimize S3/HDFS listing operations. More specific prefixes result in faster 
file discovery.
+
+```sql
+-- Good: specific prefix reduces listing scope
+"uri" = "s3://bucket/data/2024/01/15/*.csv"
+
+-- Less optimal: broad wildcard at early path segment
+"uri" = "s3://bucket/data/**/file.csv"
+```
+
+### Prefer Range Patterns for Known Sequences
+
+When you know the exact file numbering, use range patterns instead of 
wildcards:
+
+```sql
+-- Better: explicit range (note that range expansion does not preserve
+-- leading zeros, so use unpadded numbers)
+"uri" = "s3://bucket/data/part-{1..100}.csv"
+
+-- Less optimal: wildcard matches unknown files
+"uri" = "s3://bucket/data/part-*.csv"
+```
+
+### Avoid Deep Recursive Wildcards
+
+Deep recursive patterns like `**` can cause slow file listing on large buckets:
+
+```sql
+-- Avoid when possible
+"uri" = "s3://bucket/**/*.csv"
+
+-- Prefer explicit path structure
+"uri" = "s3://bucket/data/year=*/month=*/day=*/*.csv"
+```
+
+## Troubleshooting
+
+| Issue | Cause | Solution |
+|-------|-------|----------|
+| No files found | Pattern doesn't match any files | Verify the path and pattern syntax; test with a single file first |
+| Slow file listing | Wildcard too broad or too many files | Use a more specific prefix; limit the wildcard scope |
+| Invalid URI error | Malformed path syntax | Check the URI scheme and bucket name format |
+| Access denied | Credentials or permissions issue | Verify S3/HDFS credentials and bucket policies |
+
+### Testing Path Patterns
+
+Before running a large load job, test your pattern with a limited query:
+
+```sql
+-- Test if files exist and match pattern
+SELECT * FROM S3(
+    "uri" = "s3://bucket/your/pattern/*.csv",
+    ...
+) LIMIT 1;
+```
+
+Use `DESC FUNCTION` to verify the schema of matched files:
+
+```sql
+DESC FUNCTION S3(
+    "uri" = "s3://bucket/your/pattern/*.csv",
+    ...
+);
+```
diff --git a/versioned_docs/version-4.x/sql-manual/sql-functions/table-valued-functions/s3.md b/versioned_docs/version-4.x/sql-manual/sql-functions/table-valued-functions/s3.md
index 07e5e4db552..368b5682181 100644
--- a/versioned_docs/version-4.x/sql-manual/sql-functions/table-valued-functions/s3.md
+++ b/versioned_docs/version-4.x/sql-manual/sql-functions/table-valued-functions/s3.md
@@ -31,7 +31,7 @@ S3(
 
 | Parameter       | Description                                                                                                                           |
 |-----------------|---------------------------------------------------------------------------------------------------------------------------------------|
-| `uri`           | URI for accessing S3. The function will use either Path Style or Virtual-hosted Style based on the `use_path_style` parameter        |
+| `uri`           | URI for accessing S3. Supports wildcards and range patterns. See [File Path Pattern](../../basic-element/file-path-pattern) for details. The function will use either Path Style or Virtual-hosted Style based on the `use_path_style` parameter |
 | `s3.access_key` | Access key for S3                                                                                                                     |
 | `s3.secret_key` | Secret key for S3                                                                                                                     |
 | `s3.region`     | S3 region                                                                                                                             |
@@ -530,7 +530,7 @@ S3(
 
 - **URI with Wildcards**
 
-  URI can use wildcards to read multiple files. Note: When using wildcards, ensure all files have the same format (especially `csv`, `csv_with_names`, `csv_with_names_and_types` are different formats). S3 TVF will use the first file to parse Table Schema.
+  The URI can use wildcards and range patterns to read multiple files. For the detailed syntax, including `*`, `?`, `[...]`, and `{1..10}` range expansion, see [File Path Pattern](../../basic-element/file-path-pattern). Note: when using wildcards, ensure all files share the same format (`csv`, `csv_with_names`, and `csv_with_names_and_types` are distinct formats); the S3 TVF uses the first matched file to infer the table schema.
   
   With the following two CSV files:
 
diff --git a/versioned_sidebars/version-4.x-sidebars.json b/versioned_sidebars/version-4.x-sidebars.json
index 58565161378..b00e368e0f1 100644
--- a/versioned_sidebars/version-4.x-sidebars.json
+++ b/versioned_sidebars/version-4.x-sidebars.json
@@ -1173,6 +1173,7 @@
                         "sql-manual/basic-element/reserved-keywords",
                         "sql-manual/basic-element/variables",
                         "sql-manual/basic-element/comments",
+                        "sql-manual/basic-element/file-path-pattern",
                         {
                             "type": "category",
                             "label": "Operators",


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
