Copilot commented on code in PR #3337:
URL: https://github.com/apache/doris-website/pull/3337#discussion_r2765784880
##########
docs/sql-manual/basic-element/file-path-pattern.md:
##########
@@ -0,0 +1,303 @@
+---
+{
+ "title": "File Path Pattern",
+ "language": "en",
+ "description": "File path patterns and wildcards supported by Doris for
accessing files in remote storage systems like S3, HDFS, and other object
storage."
+}
+---
+
+## Description
+
+When accessing files from remote storage systems (S3, HDFS, and other
S3-compatible object storage), Doris supports flexible file path patterns
including wildcards and range expressions. This document describes the
supported path formats and pattern matching syntax.
+
+These path patterns are supported by:
+- [S3 TVF](../sql-functions/table-valued-functions/s3)
+- [HDFS TVF](../sql-functions/table-valued-functions/hdfs)
+- [Broker Load](../../data-operate/import/import-way/broker-load-manual)
+- INSERT INTO SELECT from TVF
+
+## Supported URI Formats
+
+### S3-Style URIs
+
+| Style | Format | Example |
+|-------|--------|---------|
+| AWS Client Style (Hadoop S3) | `s3://bucket/path/to/file` |
`s3://my-bucket/data/file.csv` |
+| S3A Style | `s3a://bucket/path/to/file` | `s3a://my-bucket/data/file.csv` |
+| S3N Style | `s3n://bucket/path/to/file` | `s3n://my-bucket/data/file.csv` |
+| Virtual Host Style | `https://bucket.endpoint/path/to/file` |
`https://my-bucket.s3.us-west-1.amazonaws.com/data/file.csv` |
+| Path Style | `https://endpoint/bucket/path/to/file` |
`https://s3.us-west-1.amazonaws.com/my-bucket/data/file.csv` |
+
+### Other Cloud Storage URIs
+
+| Provider | Scheme | Example |
+|----------|--------|---------|
+| Alibaba Cloud OSS | `oss://` | `oss://my-bucket/data/file.csv` |
+| Tencent Cloud COS | `cos://`, `cosn://` | `cos://my-bucket/data/file.csv` |
+| Baidu Cloud BOS | `bos://` | `bos://my-bucket/data/file.csv` |
+| Huawei Cloud OBS | `obs://` | `obs://my-bucket/data/file.csv` |
+| Google Cloud Storage | `gs://` | `gs://my-bucket/data/file.csv` |
+| Azure Blob Storage | `azure://` | `azure://container/data/file.csv` |
+
+### HDFS URIs
+
+| Style | Format | Example |
+|-------|--------|---------|
+| Standard | `hdfs://namenode:port/path/to/file` |
`hdfs://namenode:8020/user/data/file.csv` |
+| HA Mode | `hdfs://nameservice/path/to/file` |
`hdfs://my-ha-cluster/user/data/file.csv` |
+
+## Wildcard Patterns
+
+Doris uses glob-style pattern matching for file paths. The following wildcards
are supported:
+
+### Basic Wildcards
+
+| Pattern | Description | Example | Matches |
+|---------|-------------|---------|---------|
+| `*` | Matches zero or more characters within a path segment | `*.csv` |
`file.csv`, `data.csv`, `a.csv` |
+| `?` | Matches exactly one character | `file?.csv` | `file1.csv`,
`fileA.csv`, but not `file10.csv` |
+| `[abc]` | Matches any single character in brackets | `file[123].csv` |
`file1.csv`, `file2.csv`, `file3.csv` |
+| `[a-z]` | Matches any single character in the range | `file[a-c].csv` |
`filea.csv`, `fileb.csv`, `filec.csv` |
+| `[!abc]` | Matches any single character NOT in brackets | `file[!0-9].csv` |
`filea.csv`, `fileb.csv`, but not `file1.csv` |
+
+### Range Expansion (Brace Patterns)
+
+Doris supports numeric range expansion using brace patterns `{start..end}`:
+
+| Pattern | Expansion | Matches |
+|---------|-----------|---------|
+| `{1..3}` | `{1,2,3}` | `1`, `2`, `3` |
+| `{01..05}` | `{1,2,3,4,5}` | `1`, `2`, `3`, `4`, `5` (leading zeros
stripped) |
+| `{3..1}` | `{1,2,3}` | `1`, `2`, `3` (reverse ranges supported) |
Review Comment:
The row for brace range `{01..05}` says that leading zeros are stripped
(`{1,2,3,4,5}`), but later examples (e.g., `month={01..12}` and
`part-{00000..00099}.csv`) only work if zero-padded values are preserved in the
expanded filenames. The documentation contradicts itself: either the note about
stripping leading zeros or the later examples are incorrect. Please update the
behavior description and/or the examples so they accurately reflect how Doris
actually expands zero-padded numeric ranges. The suggestion below also changes
the `{3..1}` row to list values in descending order; please verify that against
the actual behavior as well.
```suggestion
| `{01..05}` | `{01,02,03,04,05}` | `01`, `02`, `03`, `04`, `05` (leading zeros preserved) |
| `{3..1}` | `{3,2,1}` | `3`, `2`, `1` (reverse ranges supported) |
```
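
To make the two candidate behaviors concrete, here is a minimal Python sketch of numeric brace-range expansion. It is not Doris code: `expand_range` and its `keep_padding` flag are hypothetical names, and the padding rule (pad to the widest endpoint when either endpoint is written with a leading zero) is an assumption borrowed from bash-style brace expansion. Which behavior Doris actually implements still needs to be confirmed against the source.

```python
import re

# Hypothetical helper for illustration only; not taken from Doris.
# Matches the first numeric range of the form {start..end}.
_RANGE = re.compile(r"\{(\d+)\.\.(\d+)\}")


def expand_range(pattern: str, keep_padding: bool = True) -> list[str]:
    """Expand the first {start..end} range in `pattern` into concrete strings."""
    match = _RANGE.search(pattern)
    if match is None:
        return [pattern]

    start_text, end_text = match.group(1), match.group(2)
    start, end = int(start_text), int(end_text)
    # Count downwards for reverse ranges such as {3..1}.
    step = 1 if start <= end else -1

    # Assumed rule: pad only when an endpoint is written with a leading zero.
    padded = any(len(t) > 1 and t.startswith("0") for t in (start_text, end_text))
    width = max(len(start_text), len(end_text)) if (keep_padding and padded) else 0

    prefix, suffix = pattern[: match.start()], pattern[match.end():]
    return [
        prefix + str(n).zfill(width) + suffix
        for n in range(start, end + step, step)
    ]


if __name__ == "__main__":
    print(expand_range("month={01..03}"))                      # padding kept
    print(expand_range("month={01..03}", keep_padding=False))  # padding stripped
    print(expand_range("{3..1}"))                              # reverse range
```

With `keep_padding=True`, `month={01..03}` expands to `month=01`, `month=02`, `month=03`, which is what the later `month={01..12}` and `part-{00000..00099}.csv` examples need. With `keep_padding=False` it expands to `month=1`, `month=2`, `month=3`, which would not match zero-padded directory or file names.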
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]